ISOLATED HUMAN TRANSPORTER PROTEINS, NUCLEIC ACID
MOLECULES ENCODING HUMAN TRANSPORTER PROTEINS,
AND USES THEREOF

RELATED APPLICATIONS 

The present application is a Continuation-In-Part of US Serial No. 09/730,002, 
filed December 8, 2000 (Atty. Docket CL001066). 

FIELD OF THE INVENTION 

The present invention is in the field of transporter proteins that are related to the 
gamma-aminobutyric acid receptor subfamily, recombinant DNA molecules, and protein
production. The present invention specifically provides novel peptides and proteins that 
effect ligand transport and nucleic acid molecules encoding such peptide and protein 
molecules, all of which are useful in the development of human therapeutics and 
diagnostic compositions and methods. 

BACKGROUND OF THE INVENTION 

Transporters 

Transporter proteins regulate many different functions of a cell, including cell
proliferation, differentiation, and signaling processes, by regulating the flow of molecules,
such as ions and macromolecules, into and out of cells. Transporters are found in the
plasma membranes of virtually every cell in eukaryotic organisms. Transporters mediate
a variety of cellular functions including regulation of membrane potentials and absorption
and secretion of molecules and ions across cell membranes. When present in intracellular
membranes of the Golgi apparatus and endocytic vesicles, transporters, such as chloride
channels, also regulate organelle pH. For a review, see Greger, R. (1988) Annu. Rev.
Physiol. 50:111-122.

Transporters are generally classified by structure and mode of action.
In addition, transporters are sometimes classified by the molecule type that is transported,
for example, sugar transporters, chloride channels, potassium channels, etc. There may
be many classes of channels for transporting a single type of molecule (a detailed review
of channel types can be found in Alexander, S.P.H. and J.A. Peters: Receptor and
transporter nomenclature supplement. Trends Pharmacol. Sci., Elsevier, pp. 65-68 (1997)
and at http://www-biology.ucsd.edu/~msaier/transport/titlepage2.html).

The following general classification scheme is known in the art and is followed in
the present discoveries.

Channel-type transporters. Transmembrane channel proteins of this class are
ubiquitously found in the membranes of all types of organisms from bacteria to higher
eukaryotes. Transport systems of this type catalyze facilitated diffusion (by an
energy-independent process) by passage through a transmembrane aqueous pore or channel
without evidence for a carrier-mediated mechanism. These channel proteins usually
consist largely of α-helical spanners, although β-strands may also be present and may
even comprise the channel. However, outer membrane porin-type channel proteins are
excluded from this class and are instead included in class 9.

Carrier-type transporters. Transport systems are included in this class if they
utilize a carrier-mediated process to catalyze uniport (a single species is transported by
facilitated diffusion), antiport (two or more species are transported in opposite directions
in a tightly coupled process, not coupled to a direct form of energy other than
chemiosmotic energy) and/or symport (two or more species are transported together in
the same direction in a tightly coupled process, not coupled to a direct form of energy
other than chemiosmotic energy).

Pyrophosphate bond hydrolysis-driven active transporters. Transport systems are 
included in this class if they hydrolyze pyrophosphate or the terminal pyrophosphate 
bond in ATP or another nucleoside triphosphate to drive the active uptake and/or 
extrusion of a solute or solutes. The transport protein may or may not be transiently
phosphorylated, but the substrate is not phosphorylated. 

PEP-dependent, phosphoryl transfer-driven group translocators. Transport 
systems of the bacterial phosphoenolpyruvate:sugar phosphotransferase system are 
included in this class. The product of the reaction, derived from extracellular sugar, is a 
cytoplasmic sugar-phosphate.

Decarboxylation-driven active transporters. Transport systems that drive solute
(e.g., ion) uptake or extrusion by decarboxylation of a cytoplasmic substrate are included 
in this class. 

Oxidoreduction-driven active transporters. Transport systems that drive transport 
of a solute (e.g., an ion) energized by the flow of electrons from a reduced substrate to an
oxidized substrate are included in this class. 

Light-driven active transporters. Transport systems that utilize light energy to 
drive transport of a solute (e.g., an ion) are included in this class. 

Mechanically-driven active transporters. Transport systems are included in this
class if they drive movement of a cell or organelle by allowing the flow of ions (or other
solutes) through the membrane down their electrochemical gradients.

Outer-membrane porins (of β-structure). These proteins form transmembrane
pores or channels that usually allow the energy-independent passage of solutes across a
membrane. The transmembrane portions of these proteins consist exclusively of β-strands
that form a β-barrel. These porin-type proteins are found in the outer membranes of
Gram-negative bacteria, mitochondria and eukaryotic plastids.

Methyltransferase-driven active transporters. A single characterized protein
currently falls into this category, the Na+-transporting
methyltetrahydromethanopterin:coenzyme M methyltransferase.

Non-ribosome-synthesized channel-forming peptides or peptide-like molecules.
These molecules, usually chains of L- and D-amino acids as well as other small
molecular building blocks such as lactate, form oligomeric transmembrane ion channels.
Voltage may induce channel formation by promoting assembly of the transmembrane
channel. These peptides are often made by bacteria and fungi as agents of biological
warfare.

Non-Proteinaceous Transport Complexes. Ion conducting substances in biological 
membranes that do not consist of or are not derived from proteins or peptides fall into this 
category. 

Functionally characterized transporters for which sequence data are lacking. 
Transporters of particular physiological significance will be included in this category
even though a family assignment cannot be made.

Putative transporters in which no family member is an established transporter.
Putative transport protein families are grouped under this number and will either be
classified elsewhere when the transport function of a member becomes established, or
will be eliminated from the TC classification system if the proposed transport function is
disproven. These families include a member or members for which a transport function
has been suggested, but evidence for such a function is not yet compelling.

Auxiliary transport proteins. Proteins that in some way facilitate transport across
one or more biological membranes but do not themselves participate directly in transport
are included in this class. These proteins always function in conjunction with one or more
transport proteins. They may provide a function connected with energy coupling to
transport, play a structural role in complex formation or serve a regulatory function.

Transporters of unknown classification. Transport protein families of unknown
classification are grouped under this number and will be classified elsewhere when the
transport process and energy coupling mechanism are characterized. These families
include at least one member for which a transport function has been established, but
either the mode of transport or the energy coupling mechanism is not known.

Ion channels

An important type of transporter is the ion channel. Ion channels regulate many
different cell proliferation, differentiation, and signaling processes by regulating the flow
of ions into and out of cells. Ion channels are found in the plasma membranes of virtually
every cell in eukaryotic organisms. Ion channels mediate a variety of cellular functions
including regulation of membrane potentials and absorption and secretion of ions across
epithelial membranes. When present in intracellular membranes of the Golgi apparatus
and endocytic vesicles, ion channels, such as chloride channels, also regulate organelle
pH. For a review, see Greger, R. (1988) Annu. Rev. Physiol. 50:111-122.

Ion channels are generally classified by structure and mode of action.
For example, extracellular ligand-gated channels (ELGs) are composed of five
polypeptide subunits, with each subunit having 4 membrane-spanning domains, and are
activated by the binding of an extracellular ligand to the channel. In addition, channels
are sometimes classified by the ion type that is transported, for example, chloride
channels, potassium channels, etc. There may be many classes of channels for
transporting a single type of ion (a detailed review of channel types can be found in
Alexander, S.P.H. and J.A. Peters (1997). Receptor and ion channel nomenclature
supplement. Trends Pharmacol. Sci., Elsevier, pp. 65-68 and at
http://www-biology.ucsd.edu/~msaier/transport/toc.html).

There are many types of ion channels based on structure. For example, many ion
channels fall within one of the following groups: extracellular ligand-gated channels
(ELG), intracellular ligand-gated channels (ILG), inward rectifying channels (INR),
intercellular (gap junction) channels, and voltage-gated channels (VIC). Other channel
families are additionally recognized based on ion type transported, cellular
location and drug sensitivity. Detailed information on each of these, their activity, ligand
type, ion type, disease association, druggability, and other information pertinent to the
present invention, is well known in the art.

Extracellular ligand-gated channels, ELGs, are generally composed of five
polypeptide subunits (Unwin, N. (1993), Cell 72: 31-41; Unwin, N. (1995), Nature 373:
37-43; Hucho, F., et al., (1996) J. Neurochem. 66: 1781-1792; Hucho, F., et al., (1996)
Eur. J. Biochem. 239: 539-557; Alexander, S.P.H. and J.A. Peters (1997), Trends
Pharmacol. Sci., Elsevier, pp. 4-6; 36-40; 42-44; and Xue, H. (1998) J. Mol. Evol. 47:
323-333). Each subunit has 4 membrane-spanning regions; this serves as a means of
identifying other members of the ELG family of proteins. ELGs bind a ligand and in
response modulate the flow of ions. Examples of ELGs include most members of the
neurotransmitter-receptor family of proteins, e.g., GABAA receptors. Other members of
this family of ion channels include glycine receptors, ryanodine receptors, and
ligand-gated calcium channels.

The Voltage-gated Ion Channel (VIC) Superfamily 

Proteins of the VIC family are ion-selective channel proteins found in a wide
range of bacteria, archaea and eukaryotes (Hille, B. (1992), Chapter 9: Structure of
channel proteins; Chapter 20: Evolution and diversity. In: Ionic Channels of Excitable
Membranes, 2nd Ed., Sinauer Assoc. Inc., Pubs., Sunderland, Massachusetts; Sigworth,
F.J. (1993), Quart. Rev. Biophys. 27: 1-40; Salkoff, L. and T. Jegla (1995), Neuron 15:
489-492; Alexander, S.P.H. et al., (1997), Trends Pharmacol. Sci., Elsevier, pp. 76-84;
Jan, L.Y. et al., (1997), Annu. Rev. Neurosci. 20: 91-123; Doyle, D.A., et al., (1998)
Science 280: 69-77; Terlau, H. and W. Stühmer (1998), Naturwissenschaften 85:
437-444). They are often homo- or heterooligomeric structures with several dissimilar
subunits (e.g., α1-α2-δ-β Ca2+ channels, αβ1β2 Na+ channels or (α)4-β K+ channels), but
the channel and the primary receptor are usually associated with the α (or α1) subunit.
Functionally characterized members are specific for K+, Na+ or Ca2+. The K+ channels
usually consist of homotetrameric structures with each α-subunit possessing six
transmembrane spanners (TMSs). The α1 and α subunits of the Ca2+ and Na+ channels,
respectively, are about four times as large and possess 4 units, each with 6 TMSs
separated by a hydrophilic loop, for a total of 24 TMSs. These large channel proteins
form heterotetra-unit structures equivalent to the homotetrameric structures of most K+
channels. All four units of the Ca2+ and Na+ channels are homologous to the single unit in
the homotetrameric K+ channels. Ion flux via the eukaryotic channels is generally
controlled by the transmembrane electrical potential (hence the designation,
voltage-sensitive) although some are controlled by ligand or receptor binding.
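
The transmembrane-spanner arithmetic in the preceding paragraph can be checked with a minimal sketch. The function and variable names below are hypothetical and serve only as illustration; the counts (6 TMSs per unit, 4 units per Ca2+ or Na+ channel subunit, 4 subunits per K+ channel) are taken from the text above.

```python
# Illustrative sketch only: names are hypothetical; counts come from the text.
TMS_PER_UNIT = 6  # each homologous unit has six transmembrane spanners


def total_tms(unit_count: int, tms_per_unit: int = TMS_PER_UNIT) -> int:
    """Return the number of TMSs contributed by `unit_count` homologous units."""
    return unit_count * tms_per_unit


# A K+ channel alpha-subunit is a single 6-TMS unit; four subunits assemble.
k_subunit_tms = total_tms(1)
k_channel_tms = 4 * k_subunit_tms

# A Ca2+ (alpha-1) or Na+ (alpha) subunit carries four 6-TMS units in one chain.
ca_na_subunit_tms = total_tms(4)

print(k_subunit_tms, k_channel_tms, ca_na_subunit_tms)
```

Under these assumptions, the single large Ca2+ or Na+ channel subunit and the assembled homotetrameric K+ channel both present 24 TMSs, which is consistent with the stated equivalence of the two architectures.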

Several putative K+-selective channel proteins of the VIC family have been
identified in prokaryotes. The structure of one of them, the KcsA K+ channel of
Streptomyces lividans, has been solved to 3.2 Å resolution. The protein possesses four
identical subunits, each with two transmembrane helices, arranged in the shape of an
inverted teepee or cone. The cone cradles the "selectivity filter" P domain in its outer end.
The narrow selectivity filter is only 12 Å long, whereas the remainder of the channel is
wider and lined with hydrophobic residues. A large water-filled cavity and helix dipoles
stabilize K+ in the pore. The selectivity filter has two bound K+ ions about 7.5 Å apart
from each other. Ion conduction is proposed to result from a balance of electrostatic
attractive and repulsive forces.

In eukaryotes, each VIC family channel type has several subtypes based on
pharmacological and electrophysiological data. Thus, there are five types of Ca2+
channels (L, N, P, Q and T). There are at least ten types of K+ channels, each responding
in different ways to different stimuli: voltage-sensitive [Ka, Kv, Kvr, Kvs and Ksr],
Ca2+-sensitive [BKCa, IKCa and SKCa] and receptor-coupled [KM and KACh]. There are at least
six types of Na+ channels (I, II, III, µ1, H1 and PN3). Tetrameric channels from both
prokaryotic and eukaryotic organisms are known in which each α-subunit possesses 2
TMSs rather than 6, and these two TMSs are homologous to TMSs 5 and 6 of the six
TMS unit found in the voltage-sensitive channel proteins. KcsA of S. lividans is an
example of such a 2 TMS channel protein. These channels may include the KNa
(Na+-activated) and KVol (cell volume-sensitive) K+ channels, as well as distantly related
channels such as the Tok1 K+ channel of yeast, the TWK-1 inward rectifier K+ channel
of the mouse and the TREK-1 K+ channel of the mouse. Because of insufficient sequence
similarity with proteins of the VIC family, inward rectifier K+ (IRK) channels
(ATP-regulated; G-protein-activated), which possess a P domain and two flanking TMSs, are
placed in a distinct family. However, substantial sequence similarity in the P region
suggests that they are homologous. The β, γ and δ subunits of VIC family members,
when present, frequently play regulatory roles in channel activation/deactivation.

The Epithelial Na+ Channel (ENaC) Family

The ENaC family consists of over twenty-four sequenced proteins (Canessa,
C.M., et al., (1994), Nature 367: 463-467; Le, T. and M.H. Saier, Jr. (1996), Mol.
Membr. Biol. 13: 149-157; Garty, H. and L.G. Palmer (1997), Physiol. Rev. 77: 359-396;
Waldmann, R., et al., (1997), Nature 386: 173-177; Darboux, I., et al., (1998), J. Biol.
Chem. 273: 9424-9429; Firsov, D., et al., (1998), EMBO J. 17: 344-352; Horisberger,
J.-D. (1998). Curr. Opin. Struc. Biol. 10: 443-449). All are from animals with no
recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins
from epithelial cells cluster tightly together on the phylogenetic tree; voltage-insensitive
ENaC homologues are also found in the brain. Eleven sequenced C. elegans proteins,
including the degenerins, are distantly related to the vertebrate proteins as well as to each
other. At least some of these proteins form part of a mechano-transducing complex for
touch sensitivity. The homologous Helix aspersa (FMRF-amide)-activated Na+ channel is
the first peptide neurotransmitter-gated ionotropic receptor to be sequenced.

Protein members of this family all exhibit the same apparent topology, each with 
N- and C-termini on the inside of the cell, two amphipathic transmembrane spanning 
segments, and a large extracellular loop. The extracellular domains contain numerous 
highly conserved cysteine residues. They are proposed to serve a receptor function. 

Mammalian ENaC is important for the maintenance of Na+ balance and the
regulation of blood pressure. Three homologous ENaC subunits, alpha, beta, and gamma,
have been shown to assemble to form the highly Na+-selective channel. The
stoichiometry of the three subunits is alpha2, beta1, gamma1 in a heterotetrameric
architecture.

The Glutamate-gated Ion Channel (GIC) Family of Neurotransmitter Receptors

Members of the GIC family are heteropentameric complexes in which each of the
5 subunits is of 800-1000 amino acyl residues in length (Nakanishi, N., et al., (1990),
Neuron 5: 569-581; Unwin, N. (1993), Cell 72: 31-41; Alexander, S.P.H. and J.A. Peters
(1997) Trends Pharmacol. Sci., Elsevier, pp. 36-40). These subunits may span the
membrane three or five times as putative α-helices with the N-termini (the
glutamate-binding domains) localized extracellularly and the C-termini localized cytoplasmically.
They may be distantly related to the ligand-gated ion channels, and if so, they may
possess substantial β-structure in their transmembrane regions. However, homology
between these two families cannot be established on the basis of sequence comparisons
alone. The subunits fall into six subfamilies: α, β, γ, δ, ε and ζ.

The GIC channels are divided into three types: (1)
α-amino-3-hydroxy-5-methyl-4-isoxazole propionate (AMPA)-, (2) kainate- and (3)
N-methyl-D-aspartate (NMDA)-selective glutamate receptors. Subunits of the AMPA and
kainate classes exhibit 35-40% identity with each other while subunits of the NMDA
receptors exhibit 22-24% identity with the former subunits. They possess large
N-terminal, extracellular glutamate-binding
domains that are homologous to the periplasmic glutamine and glutamate receptors of
ABC-type uptake permeases of Gram-negative bacteria. All known members of the GIC
family are from animals. The different channel (receptor) types exhibit distinct ion
selectivities and conductance properties. The NMDA-selective large conductance
channels are highly permeable to monovalent cations and Ca2+. The AMPA- and
kainate-selective ion channels are permeable primarily to monovalent cations with only low
permeability to Ca2+.

The Chloride Channel (ClC) Family

The ClC family is a large family consisting of dozens of sequenced proteins
derived from Gram-negative and Gram-positive bacteria, cyanobacteria, archaea, yeast,
plants and animals (Steinmeyer, K., et al., (1991), Nature 354: 301-304; Uchida, S., et al.,
(1993), J. Biol. Chem. 268: 3821-3824; Huang, M.-E., et al., (1994), J. Mol. Biol. 242:
595-598; Kawasaki, M., et al., (1994), Neuron 12: 597-604; Fisher, W.E., et al., (1995),
Genomics 29: 598-606; and Foskett, J.K. (1998), Annu. Rev. Physiol. 60: 689-717).
These proteins are essentially ubiquitous, although they are not encoded within genomes
of Haemophilus influenzae, Mycoplasma genitalium, and Mycoplasma pneumoniae.
Sequenced proteins vary in size from 395 amino acyl residues (M. jannaschii) to 988
residues (man). Several organisms contain multiple ClC family paralogues. For example,
Synechocystis has two paralogues, one of 451 residues in length and the other of 899
residues. Arabidopsis thaliana has at least four sequenced paralogues (775-792
residues), humans have at least five paralogues (820-988 residues), and C. elegans
also has at least five (810-950 residues). There are nine known members in mammals,
and mutations in three of the corresponding genes cause human diseases. E. coli,
Methanococcus jannaschii and Saccharomyces cerevisiae only have one ClC family
member each. With the exception of the larger Synechocystis paralogue, all bacterial
proteins are small (395-492 residues) while all eukaryotic proteins are larger (687-988
residues). These proteins exhibit 10-12 putative transmembrane α-helical spanners
(TMSs) and appear to be present in the membrane as homodimers. While one member of
the family, Torpedo ClC-0, has been reported to have two channels, one per subunit,
others are believed to have just one.

All functionally characterized members of the ClC family transport chloride,
some in a voltage-regulated process. These channels serve a variety of physiological
functions (cell volume regulation; membrane potential stabilization; signal transduction;
transepithelial transport, etc.). Different homologues in humans exhibit differing anion
selectivities, i.e., ClC4 and ClC5 share a NO3- > Cl- > Br- > I- conductance sequence,
while ClC3 has an I- > Cl- selectivity. The ClC4 and ClC5 channels and others exhibit
outward rectifying currents with currents only at voltages more positive than +20 mV.

Animal Inward Rectifier K+ Channel (IRK-C) Family

IRK channels possess the "minimal channel-forming structure" with only a P
domain, characteristic of the channel proteins of the VIC family, and two flanking
transmembrane spanners (Shuck, M.E., et al., (1994), J. Biol. Chem. 269: 24261-24270;
Ashen, M.D., et al., (1995), Am. J. Physiol. 268: H506-H511; Salkoff, L. and T. Jegla
(1995), Neuron 15: 489-492; Aguilar-Bryan, L., et al., (1998), Physiol. Rev. 78: 227-245;
Ruknudin, A., et al., (1998), J. Biol. Chem. 273: 14165-14171). They may exist in the
membrane as homo- or heterooligomers. They have a greater tendency to let K+ flow into
the cell than out. Voltage-dependence may be regulated by external K+, by internal Mg2+,
by internal ATP and/or by G-proteins. The P domains of IRK channels exhibit limited
sequence similarity to those of the VIC family, but this sequence similarity is insufficient
to establish homology. Inward rectifiers play a role in setting cellular membrane
potentials, and the closing of these channels upon depolarization permits the occurrence
of long duration action potentials with a plateau phase. Inward rectifiers lack the intrinsic
voltage sensing helices found in VIC family channels. In a few cases, those of Kir1.1a
and Kir6.2, for example, direct interaction with a member of the ABC superfamily has
been proposed to confer unique functional and regulatory properties to the heteromeric
complex, including sensitivity to ATP. The SUR1 sulfonylurea receptor (spQ09428) is
the ABC protein that regulates the Kir6.2 channel in response to ATP, and CFTR may
regulate Kir1.1a. Mutations in SUR1 are the cause of familial persistent hyperinsulinemic
hypoglycemia in infancy (PHHI), an autosomal recessive disorder characterized by
unregulated insulin secretion in the pancreas.

ATP-gated Cation Channel (ACC) Family

Members of the ACC family (also called P2X receptors) respond to ATP, a
functional neurotransmitter released by exocytosis from many types of neurons (North,
R.A. (1996), Curr. Opin. Cell Biol. 8: 474-483; Soto, F., M. Garcia-Guzman and W.
Stühmer (1997), J. Membr. Biol. 160: 91-100). They have been placed into seven groups
(P2X1 - P2X7) based on their pharmacological properties. These channels, which function
at neuron-neuron and neuron-smooth muscle junctions, may play roles in the control of
blood pressure and pain sensation. They may also function in lymphocyte and platelet
physiology. They are found only in animals.

The proteins of the ACC family are quite similar in sequence (>35% identity), but
they possess 380-1000 amino acyl residues per subunit with variability in length localized
primarily to the C-terminal domains. They possess two transmembrane spanners, one
about 30-50 residues from their N-termini, the other near residues 320-340. The
extracellular receptor domains between these two spanners (of about 270 residues) are
well conserved with numerous conserved glycyl and cysteyl residues. The hydrophilic
C-termini vary in length from 25 to 240 residues. They resemble the topologically similar
epithelial Na+ channel (ENaC) proteins in possessing (a) N- and C-termini localized
intracellularly, (b) two putative transmembrane spanners, (c) a large extracellular loop
domain, and (d) many conserved extracellular cysteyl residues. ACC family members
are, however, not demonstrably homologous with them. ACC channels are probably
hetero- or homomultimers and transport small monovalent cations (Me+). Some also
transport Ca2+; a few also transport small metabolites.

The Ryanodine-Inositol 1,4,5-triphosphate Receptor Ca2+ Channel (RIR-CaC) Family

Ryanodine (Ry)-sensitive and inositol 1,4,5-triphosphate (IP3)-sensitive
Ca2+-release channels function in the release of Ca2+ from intracellular storage sites in animal
cells and thereby regulate various Ca2+-dependent physiological processes (Hasan, G. et
al., (1992) Development 116: 967-975; Michikawa, T., et al., (1994), J. Biol. Chem. 269:
9184-9189; Tunwell, R.E.A., (1996), Biochem. J. 318: 477-487; Lee, A.G. (1996)
Biomembranes, Vol. 6, Transmembrane Receptors and Channels (A.G. Lee, ed.), JAI
Press, Denver, CO., pp 291-326; Mikoshiba, K., et al., (1996) J. Biochem. Biomem. 6:
273-289). Ry receptors occur primarily in muscle cell sarcoplasmic reticular (SR)
membranes, and IP3 receptors occur primarily in brain cell endoplasmic reticular (ER)
membranes where they effect release of Ca2+ into the cytoplasm upon activation
(opening) of the channel.

The Ry receptors are activated as a result of the activity of
dihydropyridine-sensitive Ca2+ channels. The latter are members of the voltage-sensitive ion channel
(VIC) family. Dihydropyridine-sensitive channels are present in the T-tubular systems of
muscle tissues.

Ry receptors are homotetrameric complexes with each subunit exhibiting a
molecular size of over 500,000 daltons (about 5,000 amino acyl residues). They possess
C-terminal domains with six putative transmembrane α-helical spanners (TMSs).
Putative pore-forming sequences occur between the fifth and sixth TMSs as suggested for
members of the VIC family. The large N-terminal hydrophilic domains and the small
C-terminal hydrophilic domains are localized to the cytoplasm. Low resolution
3-dimensional structural data are available. Mammals possess at least three isoforms that
probably arose by gene duplication and divergence before divergence of the mammalian
species. Homologues are present in humans and Caenorhabditis elegans.

IP3 receptors resemble Ry receptors in many respects. (1) They are
homotetrameric complexes with each subunit exhibiting a molecular size of over 300,000
daltons (about 2,700 amino acyl residues). (2) They possess C-terminal channel domains
that are homologous to those of the Ry receptors. (3) The channel domains possess six
putative TMSs and a putative channel lining region between TMSs 5 and 6. (4) Both the
large N-terminal domains and the smaller C-terminal tails face the cytoplasm. (5) They
possess covalently linked carbohydrate on extracytoplasmic loops of the channel
domains. (6) They have three currently recognized isoforms (types 1, 2, and 3) in
mammals which are subject to differential regulation and have different tissue
distributions.
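
As a rough consistency check on the subunit sizes quoted above for the Ry and IP3 receptors, the stated molecular sizes can be divided by the stated residue counts. The sketch below is illustrative only: the function name is hypothetical, the inputs are the approximate lower-bound figures given in the text, and the comparison value of roughly 110 daltons per residue is a common rule of thumb rather than a figure from this document.

```python
# Illustrative sketch only: hypothetical helper; inputs are the approximate
# figures quoted in the text ("over 500,000 daltons", "about 5,000 residues").
def mean_residue_mass(subunit_mass_da: float, residue_count: int) -> float:
    """Return the average mass per amino acyl residue in daltons."""
    if residue_count <= 0:
        raise ValueError("residue_count must be positive")
    return subunit_mass_da / residue_count


ry_mean = mean_residue_mass(500_000, 5_000)   # Ry receptor subunit
ip3_mean = mean_residue_mass(300_000, 2_700)  # IP3 receptor subunit

print(f"Ry receptor:  {ry_mean:.1f} Da per residue")
print(f"IP3 receptor: {ip3_mean:.1f} Da per residue")
```

Both quotients fall close to the rule-of-thumb value, so the quoted masses and residue counts are mutually consistent at the level of precision the text gives.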

IP3 receptors possess three domains: N-terminal IP3-binding domains, central
coupling or regulatory domains and C-terminal channel domains. Channels are activated
by IP3 binding, and like the Ry receptors, the activities of the IP3 receptor channels are
regulated by phosphorylation of the regulatory domains, catalyzed by various protein
kinases. They predominate in the endoplasmic reticular membranes of various cell types
in the brain but have also been found in the plasma membranes of some nerve cells
derived from a variety of tissues.

The channel domains of the Ry and IP3 receptors comprise a coherent family that,
in spite of apparent structural similarities, does not show appreciable sequence similarity to
the proteins of the VIC family. The Ry receptors and the IP3 receptors cluster separately
on the RIR-CaC family tree. They both have homologues in Drosophila. Based on the
phylogenetic tree for the family, the family probably evolved in the following sequence:
(1) A gene duplication event occurred that gave rise to Ry and IP3 receptors in
invertebrates. (2) Vertebrates evolved from invertebrates. (3) The three isoforms of each
receptor arose as a result of two distinct gene duplication events. (4) These isoforms were
transmitted to mammals before divergence of the mammalian species.

The Organellar Chloride Channel (O-ClC) Family

Proteins of the O-ClC family are voltage-sensitive chloride channels found in
intracellular membranes but not the plasma membranes of animal cells (Landry, D., et al.,
(1993), J. Biol. Chem. 268: 14948-14955; Valenzuela, S., et al., (1997), J. Biol. Chem. 272:
12575-12582; and Duncan, R.R., et al., (1997), J. Biol. Chem. 272: 23880-23886).
They are found in human nuclear membranes, and the bovine protein targets to
the microsomes, but not the plasma membrane, when expressed in Xenopus laevis
oocytes. These proteins are thought to function in the regulation of the membrane
potential and in transepithelial ion absorption and secretion in the kidney. They possess
two putative transmembrane α-helical spanners (TMSs) with cytoplasmic N- and
C-termini and a large luminal loop that may be glycosylated. The bovine protein is 437
amino acyl residues in length and has the two putative TMSs at positions 223-239 and
367-385. The human nuclear protein is much smaller (241 residues). A C. elegans
homologue is 260 residues long.

Gamma-Aminobutyric acid A receptors

Gamma-Aminobutyric acid A (GABAA) receptors are multisubunit ligand-gated
ion channels which mediate neuronal inhibition by GABA and are composed of at least
four subunit types (alpha, beta, gamma, and delta). The gamma-2 subunit appears to be
essential for benzodiazepine modulation of GABAA receptor function. The
gamma-aminobutyric acid receptors are the major inhibitory neurotransmitter receptors in the
brain and the site of action of a number of important pharmacologic agents including
barbiturates, benzodiazepines, and ethanol. The gamma-1 and gamma-2 subunits are
important in mediating responses to benzodiazepines, and a splicing variant of the
gamma-2 subunit, gamma-2L, is necessary for ethanol actions on the receptor, raising the
possibility that the gamma-2 gene may be involved in human genetic predisposition to
the development of alcoholism. Wilcox et al. (1992) assigned genes encoding the
gamma-1 and gamma-2 subunits of the GABA(A) receptor to chromosomes 4 and 5,
respectively, by PCR amplification of human-specific products from human-hamster
somatic cell hybrid DNAs. Using panels of chromosome-specific natural deletion
hybrids, Wilcox et al. (1992) further localized the gamma-1 gene (GABRG1) to
4p14-q21.1 and the gamma-2 gene (GABRG2) to 5q31.1-q33.2. These data suggested that the
GABRG1 gene may be clustered with the previously mapped GABRA2 and GABRB1
genes on chromosome 4 and that the GABRG2 gene may be close to the previously
localized GABRA1 gene on chromosome 5. To test this further, the GABRA1 gene was
mapped using the chromosome 5 deletion hybrids and shown to be within the same
region as the GABRG2 gene, 5q31.1-q33.2. By means of a PCR-based screening
strategy, a 450-kb human genomic YAC clone containing both the GABRA1 and the
GABRG2 genes was isolated. Pulsed field gel restriction mapping of this YAC indicated
that the 2 genes are within 200 kb of each other.

GABA-A ligand-gated channels complex selectively with dopamine D5 receptors
through the direct binding of the D5 carboxy-terminal domain with the second
intracellular loop of the GABA-A gamma-2 (short) receptor subunit. This physical
association enables mutually inhibitory functional interactions between these receptor
systems. It therefore points to a previously unknown signal transduction mechanism whereby
subtype-selective G protein-coupled receptors dynamically regulate synaptic strength
independently of classically defined second-messenger systems, and suggests a possible
framework in which to view these receptor systems in the maintenance of psychomotor
disease states, particularly schizophrenia.

For reviews related to the GABA-A receptor, see Ymer et al., EMBO J 1990
Oct;9(10):3261-7; Wang et al., Brain Res Bull 1998;45(4):421-5;
Glencorse et al., J Neurochem 1993 Dec;61(6):2294-302; Kofuji et al., J Neurochem
1991 Feb;56(2):713-5; Shivers et al., Neuron 1989 Sep;3(3):327-37; Wilcox et al., Proc
Natl Acad Sci USA 1992 Jul 1;89(13):5857-61; and Whiting et al., Proc Natl Acad Sci USA
1990 Dec;87(24):9966-70.

Transporter proteins, particularly members of the gamma-aminobutyric-acid
receptor subfamily, are a major target for drug action and development. Accordingly, it is
valuable to the field of pharmaceutical development to identify and characterize previously 
unknown transport proteins. The present invention advances the state of the art by 
providing previously unidentified human transport proteins. 

SUMMARY OF THE INVENTION

The present invention is based in part on the identification of amino acid 
sequences of human transporter peptides and proteins that are related to the gamma- 
aminobutyric-acid receptor subfamily, as well as allelic variants and other mammalian 
orthologs thereof. These unique peptide sequences, and nucleic acid sequences that 
encode these peptides, can be used as models for the development of human therapeutic 
targets, aid in the identification of therapeutic proteins, and serve as targets for the 
development of human therapeutic agents that modulate transporter activity in cells and 
tissues that express the transporter. Experimental data as provided in Figure 1 indicates 
expression in the brain. 

DESCRIPTION OF THE FIGURE SHEETS 

FIGURE 1 provides the nucleotide sequence of a cDNA molecule (SEQ ID NO:1)
that encodes the transporter protein of the present invention. In addition, structural and
functional information is provided, such as ATG start, stop and tissue distribution, where
available, that allows one to readily determine specific uses of inventions based on this 
molecular sequence. Experimental data as provided in Figure 1 indicates expression in 
the brain. 

FIGURE 2 provides the predicted amino acid sequence (SEQ ID NO:2) of the 
transporter of the present invention. In addition, structural and functional information such
as protein family, function, and modification sites is provided where available, allowing 
one to readily determine specific uses of inventions based on this molecular sequence. 

FIGURE 3 provides genomic sequences (SEQ ID NO:3) that span the gene 
encoding the transporter protein of the present invention. In addition, structural and
functional information, such as intron/exon structure, promoter location, etc., is provided
where available, allowing one to readily determine specific uses of inventions based on
this molecular sequence. 113 SNPs, including 32 indels, have been identified in the gene
encoding the transporter protein provided by the present invention and are given in 
Figure 3. 

DETAILED DESCRIPTION OF THE INVENTION

General Description 

The present invention is based on the sequencing of the human genome. During 
the sequencing and assembly of the human genome, analysis of the sequence information 
revealed previously unidentified fragments of the human genome that encode peptides 
that share structural and/or sequence homology to protein/peptide/domains identified and 
characterized within the art as being a transporter protein or part of a transporter protein 
and are related to the gamma-aminobutyric-acid receptor subfamily. Utilizing these 
sequences, additional genomic sequences were assembled and transcript and/or cDNA 
sequences were isolated and characterized. Based on this analysis, the present invention 
provides amino acid sequences of human transporter peptides and proteins that are related 
to the gamma-aminobutyric-acid receptor subfamily, nucleic acid sequences in the form 
of transcript sequences, cDNA sequences and/or genomic sequences that encode these 
transporter peptides and proteins, nucleic acid variation (allelic information), tissue 
distribution of expression, and information about the closest art known 
protein/peptide/domain that has structural or sequence homology to the transporter of the 
present invention. 

In addition to being previously unknown, the peptides that are provided in the 
present invention are selected based on their ability to be used for the development of 
commercially important products and services. Specifically, the present peptides are 
selected based on homology and/or structural relatedness to known transporter proteins of 
the gamma-aminobutyric-acid receptor subfamily and the expression pattern observed.
Experimental data as provided in Figure 1 indicates expression in the brain. The art has
clearly established the commercial importance of members of this family of proteins and 
proteins that have expression patterns similar to that of the present gene. Some of the 
more specific features of the peptides of the present invention, and the uses thereof, are 
described herein, particularly in the Background of the Invention and in the annotation 
provided in the Figures, and/or are known within the art for each of the known gamma- 
aminobutyric-acid receptor family or subfamily of transporter proteins. 

Specific Embodiments
Peptide Molecules

The present invention provides nucleic acid sequences that encode protein
molecules that have been identified as being members of the transporter family of
proteins and are related to the gamma-aminobutyric-acid receptor subfamily (protein
sequences are provided in Figure 2, transcript/cDNA sequences are provided in Figure 1
and genomic sequences are provided in Figure 3). The peptide sequences provided in
Figure 2, as well as the obvious variants described herein, particularly allelic variants as
identified herein and using the information in Figure 3, will be referred to herein as the
transporter peptides of the present invention, transporter peptides, or peptides/proteins of
the present invention.

The present invention provides isolated peptide and protein molecules that consist
of, consist essentially of, or comprise the amino acid sequences of the transporter
peptides disclosed in Figure 2 (encoded by the nucleic acid molecule shown in
Figure 1, transcript/cDNA, or Figure 3, genomic sequence), as well as all obvious variants
of these peptides that are within the art to make and use. Some of these variants are
described in detail below.

As used herein, a peptide is said to be "isolated" or "purified" when it is
substantially free of cellular material or free of chemical precursors or other chemicals.
The peptides of the present invention can be purified to homogeneity or other degrees of
purity. The level of purification will be based on the intended use. The critical feature is
that the preparation allows for the desired function of the peptide, even if in the presence of
considerable amounts of other components (the features of an isolated nucleic acid molecule
are discussed below).

In some uses, "substantially free of cellular material" includes preparations of the
peptide having less than about 30% (by dry weight) other proteins (i.e., contaminating
protein), less than about 20% other proteins, less than about 10% other proteins, or less than
about 5% other proteins. When the peptide is recombinantly produced, it can also be
substantially free of culture medium, i.e., culture medium represents less than about 20% of
the volume of the protein preparation.

The language "substantially free of chemical precursors or other chemicals" includes
preparations of the peptide in which it is separated from chemical precursors or other
chemicals that are involved in its synthesis. In one embodiment, the language "substantially
free of chemical precursors or other chemicals" includes preparations of the transporter
peptide having less than about 30% (by dry weight) chemical precursors or other chemicals,
less than about 20% chemical precursors or other chemicals, less than about 10% chemical
precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

The isolated transporter peptide can be purified from cells that naturally express it, 
purified from cells that have been altered to express it (recombinant), or synthesized using 
known protein synthesis methods. Experimental data as provided in Figure 1 indicates
expression in the brain. For example, a nucleic acid molecule encoding the transporter
peptide is cloned into an expression vector, the expression vector introduced into a host cell
and the protein expressed in the host cell. The protein can then be isolated from the cells by
an appropriate purification scheme using standard protein purification techniques. Many of
these techniques are described in detail below.

Accordingly, the present invention provides proteins that consist of the amino acid
sequences provided in Figure 2 (SEQ ID NO:2), for example, proteins encoded by the
transcript/cDNA nucleic acid sequences shown in Figure 1 (SEQ ID NO:1) and the genomic
sequences provided in Figure 3 (SEQ ID NO:3). The amino acid sequence of such a protein
is provided in Figure 2. A protein consists of an amino acid sequence when the amino acid
sequence is the final amino acid sequence of the protein.

The present invention further provides proteins that consist essentially of the amino
acid sequences provided in Figure 2 (SEQ ID NO:2), for example, proteins encoded by the
transcript/cDNA nucleic acid sequences shown in Figure 1 (SEQ ID NO:1) and the genomic
sequences provided in Figure 3 (SEQ ID NO:3). A protein consists essentially of an amino
acid sequence when such an amino acid sequence is present with only a few additional
amino acid residues, for example from about 1 to about 100 or so additional residues,
typically from 1 to about 20 additional residues in the final protein.

The present invention further provides proteins that comprise the amino acid
sequences provided in Figure 2 (SEQ ID NO:2), for example, proteins encoded by the
transcript/cDNA nucleic acid sequences shown in Figure 1 (SEQ ID NO:1) and the genomic
sequences provided in Figure 3 (SEQ ID NO:3). A protein comprises an amino acid
sequence when the amino acid sequence is at least part of the final amino acid sequence of
the protein. In such a fashion, the protein can be only the peptide or have additional amino
acid molecules, such as amino acid residues (contiguous encoded sequence) that are
naturally associated with it or heterologous amino acid residues/peptide sequences. Such a
protein can have a few additional amino acid residues or can comprise several hundred or 
more additional amino acids. The preferred classes of proteins that are comprised of the 
transporter peptides of the present invention are the naturally occurring mature proteins. A 
brief description of how various types of these proteins can be made/isolated is provided 
below. 

The transporter peptides of the present invention can be attached to heterologous
sequences to form chimeric or fusion proteins. Such chimeric and fusion proteins comprise
a transporter peptide operatively linked to a heterologous protein having an amino acid
sequence not substantially homologous to the transporter peptide. "Operatively linked"
indicates that the transporter peptide and the heterologous protein are fused in-frame. The
heterologous protein can be fused to the N-terminus or C-terminus of the transporter
peptide.

In some uses, the fusion protein does not affect the activity of the transporter peptide
per se. For example, the fusion protein can include, but is not limited to, enzymatic fusion
proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL fusions, poly-His
fusions, MYC-tagged, HI-tagged and Ig fusions. Such fusion proteins, particularly poly-His
fusions, can facilitate the purification of recombinant transporter peptide. In certain host
cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased
by using a heterologous signal sequence.

A chimeric or fusion protein can be produced by standard recombinant DNA
techniques. For example, DNA fragments coding for the different protein sequences are
ligated together in-frame in accordance with conventional techniques. In another
embodiment, the fusion gene can be synthesized by conventional techniques including
automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be
carried out using anchor primers which give rise to complementary overhangs between two
consecutive gene fragments which can subsequently be annealed and re-amplified to
generate a chimeric gene sequence (see Ausubel et al., Current Protocols in Molecular
Biology, 1992). Moreover, many expression vectors are commercially available that already
encode a fusion moiety (e.g., a GST protein). A transporter peptide-encoding nucleic acid
can be cloned into such an expression vector such that the fusion moiety is linked in-frame
to the transporter peptide.

As mentioned above, the present invention also provides and enables obvious 
variants of the amino acid sequence of the proteins of the present invention, such as 
naturally occurring mature forms of the peptide, allelic/sequence variants of the peptides, 
non-naturally occurring recombinantly derived variants of the peptides, and orthologs and 
paralogs of the peptides. Such variants can readily be generated using art-known techniques
in the fields of recombinant nucleic acid technology and protein biochemistry. It is
understood, however, that variants exclude any amino acid sequences disclosed prior to the
invention.

Such variants can readily be identified/made using molecular techniques and the
sequence information disclosed herein. Further, such variants can readily be distinguished
from other peptides based on sequence and/or structural homology to the transporter
peptides of the present invention. The degree of homology/identity present will be based
primarily on whether the peptide is a functional variant or non-functional variant, the
amount of divergence present in the paralog family and the evolutionary distance between
the orthologs.

To determine the percent identity of two amino acid sequences or two nucleic
acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps
can be introduced in one or both of a first and a second amino acid or nucleic acid
sequence for optimal alignment and non-homologous sequences can be disregarded for
comparison purposes). In a preferred embodiment, at least 30%, 40%, 50%, 60%, 70%,
80%, or 90% or more of a reference sequence is aligned for comparison purposes. The
amino acid residues or nucleotides at corresponding amino acid positions or nucleotide
positions are then compared. When a position in the first sequence is occupied by the
same amino acid residue or nucleotide as the corresponding position in the second
sequence, then the molecules are identical at that position (as used herein amino acid or
nucleic acid "identity" is equivalent to amino acid or nucleic acid "homology"). The
percent identity between the two sequences is a function of the number of identical
positions shared by the sequences, taking into account the number of gaps, and the length 
of each gap, which need to be introduced for optimal alignment of the two sequences. 
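
The identity calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned and padded with a gap character, and it assumes that the denominator is the number of alignment columns, so that each gap position lowers the result. The function name `percent_identity` and the example sequences are hypothetical and are not taken from the Figures; other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an existing pairwise alignment.

    Assumption: the denominator is the number of aligned columns, so every
    gap column lowers the score. Other conventions divide by the length of
    the shorter sequence or by the number of non-gap columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only when both residues match and
    # neither is a gap character.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical aligned pair: 5 identical columns out of 7.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```

Because gapped columns are counted in the denominator, an alignment that needs more or longer gaps yields a lower percent identity under this convention.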

The comparison of sequences and determination of percent identity and similarity
between two sequences can be accomplished using a mathematical algorithm
(Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York,
1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press,
New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A.M., and Griffin,
H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von
Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and
Devereux, J., eds., M Stockton Press, New York, 1991). In a preferred embodiment, the
percent identity between two amino acid sequences is determined using the Needleman
and Wunsch (J. Mol. Biol. 48:444-453 (1970)) algorithm which has been incorporated
into the GAP program in the GCG software package (available at http://www.gcg.com),
using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12,
10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred
embodiment, the percent identity between two nucleotide sequences is determined using
the GAP program in the GCG software package (Devereux, J., et al., Nucleic Acids Res.
12(1):387 (1984)) (available at http://www.gcg.com), using a NWSgapdna.CMP matrix
and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In
another embodiment, the percent identity between two amino acid or nucleotide
sequences is determined using the algorithm of E. Myers and W. Miller (CABIOS, 4:11-17
(1989)) which has been incorporated into the ALIGN program (version 2.0), using a
PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
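
For orientation, the global alignment approach of Needleman and Wunsch can be sketched in a few lines. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM62 or PAM250 matrix, and a single linear gap penalty in place of separate gap and length weights. The function name `needleman_wunsch` and its default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    Assumptions: flat match/mismatch scoring and a linear gap penalty.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Hypothetical nucleotide example.
aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)
print(aligned_b)
print(total)
```

A production comparison would normally substitute a residue scoring matrix and affine gap penalties, which is what the gap weight and length weight parameters recited above control.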

The nucleic acid and protein sequences of the present invention can further be 
used as a "query sequence" to perform a search against sequence databases to, for 
example, identify other family members or related sequences. Such searches can be 
performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (J. 
Mol. Biol. 215:403-10 (1990)). BLAST nucleotide searches can be performed with the 
NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences 
homologous to the nucleic acid molecules of the invention. BLAST protein searches can 
be performed with the XBLAST program, score = 50, wordlength = 3 to obtain amino
acid sequences homologous to the proteins of the invention. To obtain gapped
alignments for comparison purposes, Gapped BLAST can be utilized as described in
Altschul et al. (Nucleic Acids Res. 25(17):3389-3402 (1997)). When utilizing BLAST
and gapped BLAST programs, the default parameters of the respective programs (e.g.,
XBLAST and NBLAST) can be used.

Full-length pre-processed forms, as well as mature processed forms, of proteins that 
comprise one of the peptides of the present invention can readily be identified as having 
complete sequence identity to one of the transporter peptides of the present invention as well 
as being encoded by the same genetic locus as the transporter peptide provided herein. As 
indicated by the data presented in Figure 3, the map position was determined to be on 
chromosome 4 by ePCR, and confirmed with radiation hybrid mapping. 

Allelic variants of a transporter peptide can readily be identified as being a human 
protein having a high degree (significant) of sequence homology/identity to at least a portion 
of the transporter peptide as well as being encoded by the same genetic locus as the 
transporter peptide provided herein. Genetic locus can readily be determined based on the 
genomic information provided in Figure 3, such as the genomic sequence mapped to the 
reference human. As indicated by the data presented in Figure 3, the map position was 
determined to be on chromosome 4 by ePCR, and confirmed with radiation hybrid mapping. 

As used herein, two proteins (or a region of the proteins) have significant homology
when the amino acid sequences are typically at least about 70-80%, 80-90%, and more 
typically at least about 90-95% or more homologous. A significantly homologous amino 
acid sequence, according to the present invention, will be encoded by a nucleic acid 
sequence that will hybridize to a transporter peptide encoding nucleic acid molecule 
under stringent conditions as more fully described below. 

Figure 3 provides information on SNPs that have been identified in a gene
encoding the transporter protein of the present invention. 113 SNP variants were found,
including 32 indels (indicated by a "-").

Paralogs of a transporter peptide can readily be identified as having some degree of
significant sequence homology/identity to at least a portion of the transporter peptide, as
being encoded by a gene from humans, and as having similar activity or function. Two
proteins will typically be considered paralogs when the amino acid sequences share
typically at least about 60% or greater, and more typically at least about 70% or greater,
homology through a given region or domain. Such paralogs will be encoded by a nucleic
acid sequence that will hybridize to a transporter peptide encoding nucleic acid molecule
under moderate to stringent conditions as more fully described below.

Orthologs of a transporter peptide can readily be identified as having some degree of 
significant sequence homology/identity to at least a portion of the transporter peptide as well 
as being encoded by a gene from another organism. Preferred orthologs will be isolated 
from mammals, preferably primates, for the development of human therapeutic targets and 
agents. Such orthologs will be encoded by a nucleic acid sequence that will hybridize to a
transporter peptide encoding nucleic acid molecule under moderate to stringent
conditions, as more fully described below, depending on the degree of relatedness of the
two organisms yielding the proteins.

Non-naturally occurring variants of the transporter peptides of the present invention
can readily be generated using recombinant techniques. Such variants include, but are not
limited to, deletions, additions and substitutions in the amino acid sequence of the transporter
peptide. For example, one class of substitutions is conserved amino acid substitutions.
Such substitutions are those that substitute a given amino acid in a transporter peptide by
another amino acid of like characteristics. Typically seen as conservative substitutions are
the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile;
interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and
Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues
Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance
concerning which amino acid changes are likely to be phenotypically silent is found in
Bowie et al., Science 247:1306-1310 (1990).
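
The conservative substitution classes listed above can be captured as a small lookup, sketched here for illustration. The six groups are those named in the preceding paragraph; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and treating an unchanged residue as "not a substitution" is an assumption of this sketch.

```python
# Residue classes as listed in the text (three-letter codes).
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic
    {"Ser", "Thr"},                # hydroxyl
    {"Asp", "Glu"},                # acidic
    {"Asn", "Gln"},                # amide
    {"Lys", "Arg"},                # basic
    {"Phe", "Tyr"},                # aromatic
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True when both residues fall in the same listed class."""
    if original == replacement:
        return False  # assumption: an unchanged residue is not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("Leu", "Ile"))
print(is_conservative("Asp", "Lys"))
```

Residues outside the listed classes (for example Gly, Pro, Cys, Met, His, Trp) are never reported as conservative by this sketch, because the text does not assign them to a class.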

Variant transporter peptides can be fully functional or can lack function in one or 
more activities, e.g. ability to bind ligand, ability to transport ligand, ability to mediate 
signaling, etc. Fully functional variants typically contain only conservative variation or 
variation in non-critical residues or in non-critical regions. Figure 2 provides the result of 
protein analysis and can be used to identify critical domains/regions. Functional variants
can also contain substitution of similar amino acids that result in no change or an
insignificant change in function. Alternatively, such substitutions may positively or
negatively affect function to some degree. 

Non-functional variants typically contain one or more non-conservative amino acid 
substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, 
inversion, or deletion in a critical residue or critical region.

Amino acids that are essential for function can be identified by methods known in
the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et
al., Science 244:1081-1085 (1989)), particularly using the results provided in Figure 2. The
latter procedure introduces single alanine mutations at every residue in the molecule. The
resulting mutant molecules are then tested for biological activity such as transporter activity
or in assays such as an in vitro proliferative activity. Sites that are critical for binding
partner/substrate binding can also be determined by structural analysis such as
crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol.
Biol. 224:899-904 (1992); de Vos et al., Science 255:306-312 (1992)).
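
The design of an alanine-scanning series, as described in the paragraph above, can be sketched as follows. The sketch only enumerates the single-alanine mutant sequences; the activity assay itself is experimental and is not modeled. The function name `alanine_scan`, the one-letter sequence encoding, and the choice to skip positions that are already alanine are assumptions of this illustration.

```python
def alanine_scan(sequence: str):
    """Yield (position, original_residue, mutant_sequence) for each
    single-alanine mutant of a one-letter amino acid sequence.

    Positions are 1-based. Residues that are already alanine are skipped,
    since substituting them would reproduce the starting sequence.
    """
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        yield index + 1, residue, mutant


# Hypothetical five-residue peptide, not taken from the Figures.
for position, residue, mutant in alanine_scan("MKALT"):
    print(position, residue, mutant)
```

Each mutant in the series would then be expressed and tested, so that a loss of activity can be attributed to the single replaced residue.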

The present invention further provides fragments of the transporter peptides, in
addition to proteins and peptides that comprise and consist of such fragments, particularly
those comprising the residues identified in Figure 2. The fragments to which the invention
pertains, however, are not to be construed as encompassing fragments that may be disclosed
publicly prior to the present invention.

As used herein, a fragment comprises at least 8, 10, 12, 14, 16, or more contiguous
amino acid residues from a transporter peptide. Such fragments can be chosen based on the
ability to retain one or more of the biological activities of the transporter peptide or could be
chosen for the ability to perform a function, e.g. bind a substrate or act as an immunogen.
Particularly important fragments are biologically active fragments, peptides that are, for
example, about 8 or more amino acids in length. Such fragments will typically comprise a
domain or motif of the transporter peptide, e.g., active site, a transmembrane domain or a
substrate-binding domain. Further, possible fragments include, but are not limited to,
domain or motif containing fragments, soluble peptide fragments, and fragments containing
immunogenic structures. Predicted domains and functional sites are readily identifiable by
computer programs well known and readily available to those of skill in the art (e.g.,
PROSITE analysis). The results of one such analysis are provided in Figure 2.
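
As an illustration of the fragment definition above, contiguous fragments of a minimum length can be enumerated with a sliding window. This sketch assumes a one-letter amino acid string and, as a further assumption, excludes the full-length sequence itself from the output. The function name `contiguous_fragments` and the example sequence are hypothetical.

```python
def contiguous_fragments(sequence: str, min_length: int = 8):
    """Yield every contiguous fragment of at least min_length residues.

    Assumption: the full-length sequence is not reported as a fragment,
    so fragment lengths run from min_length to len(sequence) - 1.
    """
    if min_length < 1:
        raise ValueError("min_length must be positive")
    for length in range(min_length, len(sequence)):
        # Slide a window of the current length along the sequence.
        for start in range(len(sequence) - length + 1):
            yield sequence[start:start + length]


# Hypothetical ten-residue peptide, not taken from the Figures.
for fragment in contiguous_fragments("MKTLLVAGSW", min_length=8):
    print(fragment)
```

The candidate fragments produced this way would still need to be filtered, for example by overlap with a predicted domain or by an activity or immunogenicity test, to select those of practical interest.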

Polypeptides often contain amino acids other than the 20 amino acids commonly
referred to as the 20 naturally occurring amino acids. Further, many amino acids, including 
the terminal amino acids, may be modified by natural processes, such as processing and 
other post-translational modifications, or by chemical modification techniques well known 
in the art. Common modifications that occur naturally in transporter peptides are described 
in basic texts, detailed monographs, and the research literature, and they are well known to 
those of skill in the art (some of these features are identified in Figure 2). 

Known modifications include, but are not limited to, acetylation, acylation, ADP- 
ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme 
moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of 
a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking,
cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, 
formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, 
glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, 
myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, 
racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to 
proteins such as arginylation, and ubiquitination. 
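
Candidate sites for several of the modifications listed above can be predicted from sequence alone using PROSITE-style patterns. The following is a minimal sketch, not the analysis used to generate Figure 2: it translates a simple PROSITE pattern into a regular expression and reports every match, including overlapping ones. The N-glycosylation pattern N-{P}-[ST]-{P} is the published PROSITE pattern; the function names and the example sequence are hypothetical, and only a subset of PROSITE syntax (residues, [..] classes, {..} exclusions, x, and repeat counts) is assumed.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern such as 'N-{P}-[ST]-{P}' into a regex.

    Supports single residues, [..] classes, {..} exclusions, 'x' wildcards,
    and repeat counts like x(2) or x(2,4). Anchors (< and >) are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split the element into its core and an optional (n) or (n,m) repeat.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            rx = "."                          # any residue
        elif core.startswith("{"):
            rx = "[^" + core[1:-1] + "]"      # any residue except those listed
        else:
            rx = core                         # single residue or [..] class
        if low and high:
            rx += "{%s,%s}" % (low, high)
        elif low:
            rx += "{%s}" % low
        parts.append(rx)
    return "".join(parts)


def find_sites(sequence: str, pattern: str):
    """Return (1-based position, matched residues) for every match, overlaps included."""
    # A lookahead with a capture group lets finditer report overlapping matches.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence, used for illustration only.
    for position, residues in find_sites("MKNGSAANPTWNLTQ", "N-{P}-[ST]-{P}"):
        print(position, residues)
```

A pattern match indicates only a candidate site; whether the residue is actually modified in the expressed protein must be determined experimentally.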

Such modifications are well known to those of skill in the art and have been 
described in great detail in the scientific literature. Several particularly common 
modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic 
acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic 
texts, such as Proteins - Structure and Molecular Properties, 2nd Ed., T.E. Creighton, W. H. 
Freeman and Company, New York (1993). Many detailed reviews are available on this 
subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B.C. 
Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (Meth. Enzymol. 182:
626-646 (1990)) and Rattan et al. (Ann. N.Y. Acad. Sci. 663:48-62 (1992)).

Accordingly, the transporter peptides of the present invention also encompass 
derivatives or analogs in which a substituted amino acid residue is not one encoded by the 
genetic code, in which a substituent group is included, in which the mature transporter 
peptide is fused with another compound, such as a compound to increase the half-life of the 
transporter peptide (for example, polyethylene glycol), or in which the additional amino
acids are fused to the mature transporter peptide, such as a leader or secretory sequence or a
sequence for purification of the mature transporter peptide or a pro-protein sequence. 

Protein/Peptide Uses 

The proteins of the present invention can be used in substantial and specific
assays related to the functional information provided in the Figures; to raise antibodies or
to elicit another immune response; as a reagent (including the labeled reagent) in assays
designed to quantitatively determine levels of the protein (or its binding partner or ligand)
in biological fluids; and as markers for tissues in which the corresponding protein is
preferentially expressed (either constitutively or at a particular stage of tissue
differentiation or development or in a disease state). Where the protein binds or
potentially binds to another protein or ligand (such as, for example, in a
transporter-effector protein interaction or transporter-ligand interaction), the protein can be used to
identify the binding partner/ligand so as to develop a system to identify inhibitors of the
binding interaction. Any or all of these uses are capable of being developed into reagent
grade or kit format for commercialization as commercial products.

Methods for performing the uses listed above are well known to those skilled in
the art. References disclosing such methods include "Molecular Cloning: A Laboratory
Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T.
Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

Substantial chemical and structural homology exists between the
gamma-aminobutyric-acid receptor protein described herein and rat, mouse, bovine, and chicken
GABAA receptors (see Figure 1). As discussed in the background, these GABAA
receptors are known in the art to be involved in mediating neuronal inhibition by
GABA, and are the major inhibitory neurotransmitter receptors. Consequently, they may
play a role in human genetic predisposition to the development of alcoholism, as well as
in the maintenance of psychomotor disease states, particularly schizophrenia.
Accordingly, the gamma-aminobutyric-acid receptor protein, and the encoding gene,
provided by the present invention are useful for treating, preventing, and/or diagnosing
alcoholism, psychomotor diseases such as schizophrenia, and other disorders associated
with neurotransmitter receptors.

The potential uses of the peptides of the present invention are based primarily on
the source of the protein as well as the class/action of the protein. For example,
transporters isolated from humans and their human/mammalian orthologs serve as targets
for identifying agents for use in mammalian therapeutic applications, e.g. a human drug,
particularly in modulating a biological or pathological response in a cell or tissue that
expresses the transporter. Experimental data as provided in Figure 1 indicates that
transporter proteins of the present invention are expressed in the brain. Specifically, a
virtual northern blot shows expression in the human and fetal brain. In addition, a
PCR-based tissue screening panel confirms expression in human fetal brain. A large
percentage of pharmaceutical agents are being developed that modulate the activity of
transporter proteins, particularly members of the gamma-aminobutyric-acid receptor
subfamily (see Background of the Invention). The structural and functional information
provided in the Background and Figures provides specific and substantial uses for the
molecules of the present invention, particularly in combination with the expression
information provided in Figure 1. Experimental data as provided in Figure 1 indicates
expression in the brain. Such uses can readily be determined using the information
provided herein, that which is known in the art, and routine experimentation.

The proteins of the present invention (including variants and fragments that may
have been disclosed prior to the present invention) are useful for biological assays related to
transporters that are related to members of the gamma-aminobutyric-acid receptor
subfamily. Such assays involve any of the known transporter functions or activities or
properties useful for diagnosis and treatment of transporter-related conditions that are
specific for the subfamily of transporters to which the one of the present invention belongs,
particularly in cells and tissues that express the transporter. Experimental data as provided
in Figure 1 indicates that transporter proteins of the present invention are expressed in the
brain. Specifically, a virtual northern blot shows expression in the human and fetal brain. In
addition, a PCR-based tissue screening panel confirms expression in human fetal brain.

The proteins of the present invention are also useful in drug screening assays, in
cell-based or cell-free systems (Hodgson, Bio/technology, 1992, Sept 10(9):973-80).
Cell-based systems can be native, i.e., cells that normally express the transporter, as a biopsy or
expanded in cell culture. Experimental data as provided in Figure 1 indicates expression in
the brain. In an alternate embodiment, cell-based assays involve recombinant host cells
expressing the transporter protein.

The polypeptides can be used to identify compounds that modulate transporter
activity of the protein in its natural state or an altered form that causes a specific disease or
pathology associated with the transporter. Both the transporters of the present invention and
appropriate variants and fragments can be used in high-throughput screens to assay
candidate compounds for the ability to bind to the transporter. These compounds can be
further screened against a functional transporter to determine the effect of the compound on
the transporter activity. Further, these compounds can be tested in animal or invertebrate
systems to determine activity/effectiveness. Compounds can be identified that activate
(agonist) or inactivate (antagonist) the transporter to a desired degree.

Further, the proteins of the present invention can be used to screen a compound for
the ability to stimulate or inhibit interaction between the transporter protein and a molecule
that normally interacts with the transporter protein, e.g. a substrate or a component of the
signal pathway with which the transporter protein normally interacts (for example, another
transporter). Such assays typically include the steps of combining the transporter protein
with a candidate compound under conditions that allow the transporter protein, or fragment,
to interact with the target molecule, and detecting the formation of a complex between the
protein and the target or detecting the biochemical consequence of the interaction between the
transporter protein and the target, such as any of the associated effects of signal transduction,
for example changes in membrane potential, protein phosphorylation, cAMP turnover, and
adenylate cyclase activation.

Candidate compounds include, for example, 1) peptides such as soluble peptides,
including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam
et al., Nature 354:82-84 (1991); Houghten et al., Nature 354:84-86 (1991)) and
combinatorial chemistry-derived molecular libraries made of D- and/or L- configuration
amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate,
directed phosphopeptide libraries, see, e.g., Songyang et al., Cell 72:767-778 (1993)); 3)
antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single
chain antibodies as well as Fab, F(ab')2, Fab expression library fragments, and
epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g.,
molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble fragment of the receptor that competes for
ligand binding. Other candidate compounds include mutant transporters or appropriate
fragments containing mutations that affect transporter function and thus compete for ligand.
Accordingly, a fragment that competes for ligand, for example with a higher affinity, or a
fragment that binds ligand but does not allow release, is encompassed by the invention.

The invention further includes other end point assays to identify compounds that
modulate (stimulate or inhibit) transporter activity. The assays typically involve an assay of
events in the signal transduction pathway that indicate transporter activity. Thus, the
transport of a ligand, a change in cell membrane potential, activation of a protein, or a change in
the expression of genes that are up- or down-regulated in response to the transporter protein
dependent signal cascade can be assayed.

Any of the biological or biochemical functions mediated by the transporter can be
used as an endpoint assay. These include all of the biochemical or biochemical/biological
events described herein, in the references cited herein, incorporated by reference for these
endpoint assay targets, and other functions known to those of ordinary skill in the art or that
can be readily identified using the information provided in the Figures, particularly Figure 2.
Specifically, a biological function of a cell or tissues that expresses the transporter can be
assayed. Experimental data as provided in Figure 1 indicates that transporter proteins of the
present invention are expressed in the brain. Specifically, a virtual northern blot shows
expression in the human and fetal brain. In addition, a PCR-based tissue screening panel
confirms expression in human fetal brain.

Binding and/or activating compounds can also be screened by using chimeric
transporter proteins in which the amino terminal extracellular domain, or parts thereof, the
entire transmembrane domain or subregions, such as any of the seven transmembrane
segments or any of the intracellular or extracellular loops and the carboxy terminal
intracellular domain, or parts thereof, can be replaced by heterologous domains or
subregions. For example, a ligand-binding region can be used that interacts with a different
ligand than that which is recognized by the native transporter. Accordingly, a different set
of signal transduction components is available as an end-point assay for activation. This
allows for assays to be performed in other than the specific host cell from which the
transporter is derived.

The proteins of the present invention are also useful in competition binding assays in
methods designed to discover compounds that interact with the transporter (e.g. binding
partners and/or ligands). Thus, a compound is exposed to a transporter polypeptide under
conditions that allow the compound to bind or to otherwise interact with the polypeptide.
Soluble transporter polypeptide is also added to the mixture. If the test compound interacts
with the soluble transporter polypeptide, it decreases the amount of complex formed or
activity from the transporter target. This type of assay is particularly useful in cases in
which compounds are sought that interact with specific regions of the transporter. Thus, the
soluble polypeptide that competes with the target transporter region is designed to contain
peptide sequences corresponding to the region of interest.
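
The expected readout of such a competition binding assay can be illustrated with a simple equilibrium model. The sketch below is an illustrative assumption and not part of the disclosed methods: it assumes one-to-one binding, a target present in trace amounts, and soluble polypeptide in large excess over the test compound, so that the free compound concentration is approximately C / (1 + S / Kd_soluble). All names and numerical values are hypothetical.

```python
def target_occupancy(compound_total: float,
                     kd_target: float,
                     soluble_total: float = 0.0,
                     kd_soluble: float = 1.0) -> float:
    """Fraction of transporter target bound by the test compound at equilibrium.

    Simplifying assumptions (not taken from the specification):
      * the target is present in trace amounts and does not deplete the compound;
      * the soluble polypeptide is in excess, so its free and total
        concentrations are approximately equal;
      * all concentrations and dissociation constants share the same units.
    """
    # Compound sequestered by the soluble polypeptide is unavailable to the target.
    free_compound = compound_total / (1.0 + soluble_total / kd_soluble)
    # Standard one-site binding isotherm for the remaining free compound.
    return free_compound / (free_compound + kd_target)


if __name__ == "__main__":
    # Hypothetical values (e.g. nM), chosen only to show the trend.
    without_competitor = target_occupancy(10.0, kd_target=10.0)
    with_competitor = target_occupancy(10.0, kd_target=10.0,
                                       soluble_total=90.0, kd_soluble=10.0)
    print(without_competitor, with_competitor)
```

Under these assumptions, a compound that binds the region carried by the soluble polypeptide shows reduced complex formation with the target, whereas a compound that binds elsewhere (a very large kd_soluble in this model) is essentially unaffected.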

To perform cell free drug screening assays, it is sometimes desirable to immobilize
either the transporter protein, or fragment, or its target molecule to facilitate separation of
complexes from uncomplexed forms of one or both of the proteins, as well as to
accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening
assays. In one embodiment, a fusion protein can be provided which adds a domain that
allows the protein to be bound to a matrix. For example, glutathione-S-transferase fusion
proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis,
MO) or glutathione derivatized microtitre plates, which are then combined with the cell
lysates (e.g., 35S-labeled) and the candidate compound, and the mixture incubated under
conditions conducive to complex formation (e.g., at physiological conditions for salt and
pH). Following incubation, the beads are washed to remove any unbound label, and the
matrix is immobilized and the radiolabel determined directly, or in the supernatant after the
complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix,
separated by SDS-PAGE, and the level of transporter-binding protein found in the bead
fraction quantitated from the gel using standard electrophoretic techniques. For example,
either the polypeptide or its target molecule can be immobilized utilizing conjugation of
biotin and streptavidin using techniques well known in the art. Alternatively, antibodies
reactive with the protein but which do not interfere with binding of the protein to its target
molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by
antibody conjugation. Preparations of a transporter-binding protein and a candidate
compound are incubated in the transporter protein-presenting wells and the amount of
complex trapped in the well can be quantitated. Methods for detecting such complexes, in
addition to those described above for the GST-immobilized complexes, include
immunodetection of complexes using antibodies reactive with the transporter protein target
molecule, or which are reactive with transporter protein and compete with the target
molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity
associated with the target molecule.

Agents that modulate one of the transporters of the present invention can be
identified using one or more of the above assays, alone or in combination. It is generally
preferable to use a cell-based or cell free system first and then confirm activity in an animal
or other model system. Such model systems are well known in the art and can readily be
employed in this context.

Modulators of transporter protein activity identified according to these drug
screening assays can be used to treat a subject with a disorder mediated by the transporter
pathway, by treating cells or tissues that express the transporter. Experimental data as
provided in Figure 1 indicates expression in the brain. These methods of treatment include
the steps of administering a modulator of transporter activity in a pharmaceutical
composition to a subject in need of such treatment, the modulator being identified as
described herein.

In yet another aspect of the invention, the transporter proteins can be used as "bait
proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No.
5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem.
268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993)
Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins which bind
to or interact with the transporter and are involved in transporter activity. Such
transporter-binding proteins are also likely to be involved in the propagation of signals by
the transporter proteins or transporter targets as, for example, downstream elements of a
transporter-mediated signaling pathway. Alternatively, such transporter-binding proteins
are likely to be transporter inhibitors.

The two-hybrid system is based on the modular nature of most transcription
factors, which consist of separable DNA-binding and activation domains. Briefly, the
assay utilizes two different DNA constructs. In one construct, the gene that codes for a
transporter protein is fused to a gene encoding the DNA binding domain of a known
transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library
of DNA sequences, that encodes an unidentified protein ("prey" or "sample") is fused to a
gene that codes for the activation domain of the known transcription factor. If the "bait"
and the "prey" proteins are able to interact, in vivo, forming a transporter-dependent
complex, the DNA-binding and activation domains of the transcription factor are brought
into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ)
which is operably linked to a transcriptional regulatory site responsive to the transcription
factor. Expression of the reporter gene can be detected and cell colonies containing the
functional transcription factor can be isolated and used to obtain the cloned gene which
encodes the protein which interacts with the transporter protein.
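
The selection logic of the two-hybrid system described above can be summarized in a short sketch. This is a toy model for illustration only, not a laboratory protocol: the reporter is scored as active only when one fusion carries the DNA-binding domain, the other carries the activation domain, and the two fused proteins interact. The class, function, and protein names are hypothetical, and the table of interacting pairs stands in for what the experiment would actually reveal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    """A protein fused to one half of a split transcription factor."""
    protein: str
    domain: str  # "DBD" (DNA-binding domain) or "AD" (activation domain)


def reporter_active(bait: Fusion, prey: Fusion, interactions: set) -> bool:
    """Return True when the reporter gene would be transcribed in this model.

    Transcription requires both halves of the transcription factor to be
    present and brought together by a bait-prey interaction.
    """
    has_both_halves = {bait.domain, prey.domain} == {"DBD", "AD"}
    proteins_interact = frozenset((bait.protein, prey.protein)) in interactions
    return has_both_halves and proteins_interact


def screen_library(bait: Fusion, library: list, interactions: set) -> list:
    """Return the prey proteins whose colonies would express the reporter."""
    return [prey.protein for prey in library
            if reporter_active(bait, prey, interactions)]


if __name__ == "__main__":
    # Hypothetical names; the interaction set is assumed for the example.
    bait = Fusion("transporter", "DBD")
    library = [Fusion("preyA", "AD"), Fusion("preyB", "AD"), Fusion("preyC", "AD")]
    known_pairs = {frozenset(("transporter", "preyB"))}
    print(screen_library(bait, library, known_pairs))
```

In practice, reporter-positive colonies are candidates only, and the interaction would normally be confirmed by an independent method.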

This invention further pertains to novel agents identified by the above-described
screening assays. Accordingly, it is within the scope of this invention to further use an
agent identified as described herein in an appropriate animal model. For example, an
agent identified as described herein (e.g., a transporter-modulating agent, an antisense
transporter nucleic acid molecule, a transporter-specific antibody, or a
transporter-binding partner) can be used in an animal or other model to determine the efficacy,
toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified
as described herein can be used in an animal or other model to determine the mechanism
of action of such an agent. Furthermore, this invention pertains to uses of novel agents
identified by the above-described screening assays for treatments as described herein.

The transporter proteins of the present invention are also useful to provide a target
for diagnosing a disease or predisposition to disease mediated by the peptide. Accordingly,
the invention provides methods for detecting the presence, or levels of, the protein (or
encoding mRNA) in a cell, tissue, or organism. Experimental data as provided in Figure 1
indicates expression in the brain. The method involves contacting a biological sample with
a compound capable of interacting with the transporter protein such that the interaction can
be detected. Such an assay can be provided in a single detection format or a multi-detection
format such as an antibody chip array.

One agent for detecting a protein in a sample is an antibody capable of selectively 
binding to the protein. A biological sample includes tissues, cells and biological fluids isolated
from a subject, as well as tissues, cells and fluids present within a subject. 

The peptides of the present invention also provide targets for diagnosing active
protein activity, disease, or predisposition to disease, in a patient having a variant peptide,
particularly activities and conditions that are known for other members of the family of
proteins to which the present one belongs. Thus, the peptide can be isolated from a
biological sample and assayed for the presence of a genetic mutation that results in aberrant
peptide. This includes amino acid substitution, deletion, insertion, rearrangement (as the
result of aberrant splicing events), and inappropriate post-translational modification.
Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest,
altered transporter activity in cell-based or cell-free assay, alteration in ligand or
antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the
known assay techniques useful for detecting mutations in a protein. Such an assay can be
provided in a single detection format or a multi-detection format such as an antibody chip
array.

In vitro techniques for detection of peptide include enzyme linked immunosorbent
assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence using a
detection reagent, such as an antibody or protein binding agent. Alternatively, the peptide
can be detected in vivo in a subject by introducing into the subject a labeled anti-peptide
antibody or other types of detection agent. For example, the antibody can be labeled with a
radioactive marker whose presence and location in a subject can be detected by standard
imaging techniques. Particularly useful are methods that detect the allelic variant of a
peptide expressed in a subject and methods which detect fragments of a peptide in a sample.

The peptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals
with clinically significant hereditary variations in the response to drugs due to altered drug
disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (Clin. Exp.
Pharmacol. Physiol. 23(10-11):983-985 (1996)), and Linder, M.W. (Clin. Chem.
43(2):254-266 (1997)). The clinical outcomes of these variations result in severe toxicity of
therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals
as a result of individual variation in metabolism. Thus, the genotype of the individual can
determine the way a therapeutic compound acts on the body or the way the body
metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both
the intensity and duration of drug action. Thus, the pharmacogenomics of the individual 
permit the selection of effective compounds and effective dosages of such compounds for 
prophylactic or therapeutic treatment based on the individual's genotype. The discovery of 
genetic polymorphisms in some drug metabolizing enzymes has explained why some 
patients do not obtain the expected drug effects, show an exaggerated drug effect, or 
experience serious toxicity from standard drug dosages. Polymorphisms can be expressed 
in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. 
Accordingly, genetic polymorphism may lead to allelic protein variants of the transporter 
protein in which one or more of the transporter functions in one population is different from 
those in another population. The peptides thus provide a target for ascertaining a genetic
predisposition that can affect treatment modality. Thus, in a ligand-based treatment, 
polymorphism may give rise to amino terminal extracellular domains and/or other ligand- 
binding regions that are more or less active in ligand binding, and transporter activation. 
Accordingly, ligand dosage would necessarily be modified to maximize the therapeutic 
effect within a given population containing a polymorphism. As an alternative to 
genotyping, specific polymorphic peptides could be identified. 

The peptides are also useful for treating a disorder characterized by an absence of, 
inappropriate, or unwanted expression of the protein. Experimental data as provided in 
Figure 1 indicates expression in the brain. Accordingly, methods for treatment include the 
use of the transporter protein or fragments. 

Antibodies 

The invention also provides antibodies that selectively bind to one of the peptides of 
the present invention, a protein comprising such a peptide, as well as variants and fragments 
thereof. As used herein, an antibody selectively binds a target peptide when it binds the 
target peptide and does not significantly bind to unrelated proteins. An antibody is still
considered to selectively bind a peptide even if it also binds to other proteins that are not
substantially homologous with the target peptide so long as such proteins share homology
with a fragment or domain of the peptide target of the antibody. In this case, it would be
understood that antibody binding to the peptide is still selective despite some degree of
cross-reactivity.

As used herein, an antibody is defined in terms consistent with that recognized
within the art: antibodies are multi-subunit proteins produced by a mammalian organism in
response to an antigen challenge. The antibodies of the present invention include polyclonal
antibodies and monoclonal antibodies, as well as fragments of such antibodies, including,
but not limited to, Fab or F(ab')2, and Fv fragments.

Many methods are known for generating and/or identifying antibodies to a given
target peptide. Several such methods are described by Harlow, Antibodies, Cold Spring
Harbor Press, (1989).

In general, to generate antibodies, an isolated peptide is used as an immunogen and
is administered to a mammalian organism, such as a rat, rabbit or mouse. The full-length
protein, an antigenic peptide fragment or a fusion protein can be used. Particularly
important fragments are those covering functional domains, such as the domains identified
in Figure 2, and domains of sequence homology or divergence amongst the family, such as
those that can readily be identified using protein alignment methods and as presented in the
Figures.

Antibodies are preferably prepared from regions or discrete fragments of the 
transporter proteins. Antibodies can be prepared from any region of the peptide as 
described herein. However, preferred regions will include those involved in 
function/activity and/or transporter/binding partner interaction. Figure 2 can be used to 
identify particularly important regions while sequence alignment can be used to identify
conserved and unique sequence fragments. 

An antigenic fragment will typically comprise at least 8 contiguous amino acid
residues. The antigenic peptide can comprise, however, at least 10, 12, 14, 16 or more
amino acid residues. Such fragments can be selected based on a physical property, such as
fragments corresponding to regions that are located on the surface of the protein, e.g.,
hydrophilic regions, or can be selected based on sequence uniqueness (see Figure 2).
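
One way to shortlist hydrophilic candidate fragments of at least 8 contiguous residues is a sliding-window hydropathy scan. The sketch below is an illustrative assumption rather than the method used to prepare the Figures: it uses the Kyte-Doolittle hydropathy scale, which is not named in this description, and ranks windows by mean hydropathy, with the most negative values taken as the most hydrophilic. The function name, window size, and example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(sequence: str, window: int = 8, top: int = 3):
    """Return the `top` most hydrophilic windows of `window` contiguous residues.

    Each result is (mean hydropathy, 1-based start position, fragment),
    sorted from most to least hydrophilic. Only the 20 standard one-letter
    residue codes are supported; any other character raises KeyError.
    """
    scored = []
    for start in range(len(sequence) - window + 1):
        fragment = sequence[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in fragment) / window
        scored.append((mean, start + 1, fragment))
    # Ascending sort puts the most negative (most hydrophilic) windows first.
    return sorted(scored)[:top]


if __name__ == "__main__":
    # Hypothetical sequence, used for illustration only.
    for mean, position, fragment in hydrophilic_windows("LLLLVVVVKKDDEERRAAAA", top=1):
        print(position, fragment, round(mean, 2))
```

Hydrophilicity is only a proxy for surface exposure; candidate fragments would still be checked against the domain information in Figure 2 and for sequence uniqueness before use as immunogens.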

Detection of an antibody of the present invention can be facilitated by coupling (i.e.,
physically linking) the antibody to a detectable substance. Examples of detectable
substances include various enzymes, prosthetic groups, fluorescent materials, luminescent
materials, bioluminescent materials, and radioactive materials. Examples of suitable
enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or
acetylcholinesterase; examples of suitable prosthetic group complexes include
streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include
umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine
fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes
luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin;
and examples of suitable radioactive material include 125I, 131I, 35S or 3H.

Antibody Uses

The antibodies can be used to isolate one of the proteins of the present invention by
standard techniques, such as affinity chromatography or immunoprecipitation. The
antibodies can facilitate the purification of the natural protein from cells and recombinantly
produced protein expressed in host cells. In addition, such antibodies are useful to detect the
presence of one of the proteins of the present invention in cells or tissues to determine the
pattern of expression of the protein among various tissues in an organism and over the
course of normal development. Experimental data as provided in Figure 1 indicates that
transporter proteins of the present invention are expressed in the brain. Specifically, a
virtual northern blot shows expression in the human and fetal brain. In addition, a PCR-based
tissue screening panel confirms expression in human fetal brain. Further, such antibodies
can be used to detect protein in situ, in vitro, or in a cell lysate or supernatant in order to
evaluate the abundance and pattern of expression. Also, such antibodies can be used to
assess abnormal tissue distribution or abnormal expression during development or
progression of a biological condition. Antibody detection of circulating fragments of the
full length protein can be used to identify turnover.

Further, the antibodies can be used to assess expression in disease states such as in
active stages of the disease or in an individual with a predisposition toward disease related to
the protein's function. When a disorder is caused by an inappropriate tissue distribution,
developmental expression, level of expression of the protein, or expressed/processed form, 
the antibody can be prepared against the normal protein. Experimental data as provided in 
Figure 1 indicates expression in the brain. If a disorder is characterized by a specific 
mutation in the protein, antibodies specific for this mutant protein can be used to assay for 
the presence of the specific mutant protein. 

The antibodies can also be used to assess normal and aberrant subcellular 
localization of cells in the various tissues in an organism. Experimental data as provided in 
Figure 1 indicates expression in the brain. The diagnostic uses can be applied, not only in 
genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment 
is ultimately aimed at correcting expression level or the presence of aberrant sequence and 
aberrant tissue distribution or developmental expression, antibodies directed against the 
protein or relevant fragments can be used to monitor therapeutic efficacy. 

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies 
prepared against polymorphic proteins can be used to identify individuals that require 
modified treatment modalities. The antibodies are also useful as diagnostic tools as an 
immunological marker for aberrant protein analyzed by electrophoretic mobility, isoelectric 
point, tryptic peptide digest, and other physical assays known to those in the art. 

The antibodies are also useful for tissue typing. Experimental data as provided in 
Figure 1 indicates expression in the brain. Thus, where a specific protein has been 
correlated with expression in a specific tissue, antibodies that are specific for this protein can 
be used to identify a tissue type. 

The antibodies are also useful for inhibiting protein function, for example, blocking 
the binding of the transporter peptide to a binding partner such as a ligand or protein binding 
partner. These uses can also be applied in a therapeutic context in which treatment involves 
inhibiting the protein's function. An antibody can be used, for example, to block binding, 
thus modulating (agonizing or antagonizing) the peptide's activity. Antibodies can be 
prepared against specific fragments containing sites required for function or against intact 
protein that is associated with a cell or cell membrane. See Figure 2 for structural 
information relating to the proteins of the present invention. 

The invention also encompasses kits for using antibodies to detect the presence of a 
protein in a biological sample. The kit can comprise antibodies such as a labeled or 
labelable antibody and a compound or agent for detecting protein in a biological sample; 
means for determining the amount of protein in the sample; means for comparing the 
amount of protein in the sample with a standard; and instructions for use. Such a kit can be 
supplied to detect a single protein or epitope or can be configured to detect one of a 
multitude of epitopes, such as in an antibody detection array. Arrays are described in detail 
below for nucleic acid arrays and similar methods have been developed for antibody arrays. 

Nucleic Acid Molecules 

The present invention further provides isolated nucleic acid molecules that encode a 
transporter peptide or protein of the present invention (cDNA, transcript and genomic 
sequence). Such nucleic acid molecules will consist of, consist essentially of, or comprise a 
nucleotide sequence that encodes one of the transporter peptides of the present invention, an 
allelic variant thereof, or an ortholog or paralog thereof. 

As used herein, an "isolated" nucleic acid molecule is one that is separated from 
other nucleic acid present in the natural source of the nucleic acid. Preferably, an "isolated" 
nucleic acid is free of sequences that naturally flank the nucleic acid (i.e., sequences located 
at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the organism from which 
the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for 
example up to about 5KB, 4KB, 3KB, 2KB, or 1KB or less, particularly contiguous peptide 
encoding sequences and peptide encoding sequences within the same gene but separated by 
introns in the genomic sequence. The important point is that the nucleic acid is isolated 
from remote and unimportant flanking sequences such that it can be subjected to the specific 
manipulations described herein such as recombinant expression, preparation of probes and 
primers, and other uses specific to the nucleic acid sequences. 

Moreover, an "isolated" nucleic acid molecule, such as a transcript/cDNA molecule, 
can be substantially free of other cellular material, or culture medium when produced by 
recombinant techniques, or chemical precursors or other chemicals when chemically 
synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory 
sequences and still be considered isolated. 

For example, recombinant DNA molecules contained in a vector are considered 
isolated. Further examples of isolated DNA molecules include recombinant DNA 
molecules maintained in heterologous host cells or purified (partially or substantially) DNA 
molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts 
of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules 
according to the present invention further include such molecules produced synthetically. 

Accordingly, the present invention provides nucleic acid molecules that consist of 
the nucleotide sequence shown in Figure 1 or 3 (SEQ ID NO:1, transcript sequence and 
SEQ ID NO:3, genomic sequence), or any nucleic acid molecule that encodes the protein 
provided in Figure 2, SEQ ID NO:2. A nucleic acid molecule consists of a nucleotide 
sequence when the nucleotide sequence is the complete nucleotide sequence of the nucleic 
acid molecule. 

The present invention further provides nucleic acid molecules that consist essentially 
of the nucleotide sequence shown in Figure 1 or 3 (SEQ ID NO:1, transcript sequence and 
SEQ ID NO:3, genomic sequence), or any nucleic acid molecule that encodes the protein 
provided in Figure 2, SEQ ID NO:2. A nucleic acid molecule consists essentially of a 
nucleotide sequence when such a nucleotide sequence is present with only a few additional 
nucleic acid residues in the final nucleic acid molecule. 

The present invention further provides nucleic acid molecules that comprise the 
nucleotide sequences shown in Figure 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ 
ID NO:3, genomic sequence), or any nucleic acid molecule that encodes the protein 
provided in Figure 2, SEQ ID NO:2. A nucleic acid molecule comprises a nucleotide 
sequence when the nucleotide sequence is at least part of the final nucleotide sequence of 
the nucleic acid molecule. In such a fashion, the nucleic acid molecule can be only the 
nucleotide sequence or have additional nucleic acid residues, such as nucleic acid residues 
that are naturally associated with it or heterologous nucleotide sequences. Such a nucleic 
acid molecule can have a few additional nucleotides or can comprise several hundred or 
more additional nucleotides. A brief description of how various types of these nucleic acid 
molecules can be readily made/isolated is provided below. 

In Figures 1 and 3, both coding and non-coding sequences are provided. Because 
of the source of the present invention, human genomic sequence (Figure 3) and 
cDNA/transcript sequences (Figure 1), the nucleic acid molecules in the Figures will 
contain genomic intronic sequences, 5' and 3' non-coding sequences, gene regulatory 
regions and non-coding intergenic sequences. In general such sequence features are 
either noted in Figures 1 and 3 or can readily be identified using computational tools 
known in the art. As discussed below, some of the non-coding regions, particularly gene 
regulatory elements such as promoters, are useful for a variety of purposes, e.g. control of 
heterologous gene expression, target for identifying gene activity modulating compounds, 
and are particularly claimed as fragments of the genomic sequence provided herein. 

The isolated nucleic acid molecules can encode the mature protein plus additional 
amino or carboxyl-terminal amino acids, or amino acids interior to the mature peptide (when 
the mature form has more than one peptide chain, for instance). Such sequences may play a 
role in processing of a protein from precursor to a mature form, facilitate protein trafficking, 
prolong or shorten protein half-life or facilitate manipulation of a protein for assay or 
production, among other things. As generally is the case in situ, the additional amino acids 
may be processed away from the mature protein by cellular enzymes. 

As mentioned above, the isolated nucleic acid molecules include, but are not limited 
to, the sequence encoding the transporter peptide alone, the sequence encoding the mature 
peptide and additional coding sequences, such as a leader or secretory sequence (e.g., a 
pre-pro or pro-protein sequence), the sequence encoding the mature peptide, with or without the 
additional coding sequences, plus additional non-coding sequences, for example introns and 
non-coding 5' and 3' sequences such as transcribed but non-translated sequences that play a 
role in transcription, mRNA processing (including splicing and polyadenylation signals), 
ribosome binding and stability of mRNA. In addition, the nucleic acid molecule may be 
fused to a marker sequence encoding, for example, a peptide that facilitates purification. 

Isolated nucleic acid molecules can be in the form of RNA, such as mRNA, or in the 
form of DNA, including cDNA and genomic DNA obtained by cloning or produced by 
chemical synthetic techniques or by a combination thereof. The nucleic acid, especially 
DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the 
coding strand (sense strand) or the non-coding strand (anti-sense strand). 
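
The relationship between the coding (sense) strand and the non-coding (anti-sense) strand can be illustrated with a minimal Python sketch. The function name and the input sequence are hypothetical placeholders chosen for illustration; the sequence is not taken from the Figures.

```python
# Illustrative sketch only: the sequence below is a made-up placeholder,
# not a sequence disclosed in the Figures.

# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_strand(sense: str) -> str:
    """Return the anti-sense strand (reverse complement), written 5'->3'."""
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_strand("ATGGCC"))
```

Applying the function twice returns the original strand, which is one simple way to check that a sense/anti-sense pair is consistent.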

The invention further provides nucleic acid molecules that encode fragments of the 
peptides of the present invention as well as nucleic acid molecules that encode obvious 

at least about 90-95% or more homologous to the nucleotide sequence shown in the Figure 
sheets or a fragment of this sequence. Such nucleic acid molecules can readily be identified 
as being able to hybridize under moderate to stringent conditions, to the nucleotide sequence 
shown in the Figure sheets or a fragment of the sequence. Allelic variants can readily be 
determined by genetic locus of the encoding gene. As indicated by the data presented in 
Figure 3, the map position was determined to be on chromosome 4 by ePCR, and confirmed 
with radiation hybrid mapping. 

Figure 3 provides information on SNPs that have been identified in a gene encoding 
the transporter protein of the present invention. 113 SNP variants were found, including 32 
indels (indicated by a "-"). 

As used herein, the term "hybridizes under stringent conditions" is intended to 
describe conditions for hybridization and washing under which nucleotide sequences 
encoding a peptide at least 60-70% homologous to each other typically remain hybridized to 
each other. The conditions can be such that sequences at least about 60%, at least about 
70%, or at least about 80% or more homologous to each other typically remain hybridized to 
each other. Such stringent conditions are known to those skilled in the art and can be found 
in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. 
One example of stringent hybridization conditions is hybridization in 6X sodium 
chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 
0.1% SDS at 50-65°C. Examples of moderate to low stringency hybridization conditions are 
well known in the art. 
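
The homology thresholds recited above (about 60%, 70%, or 80%) can be illustrated computationally. The following Python sketch computes percent identity between two sequences that are assumed to be already aligned to the same length, and uses identity as a simple stand-in for "homologous". The function names and example sequences are hypothetical, and the sketch does not predict whether two molecules actually remain hybridized under any particular wash condition.

```python
# Illustrative sketch: percent identity between two PRE-ALIGNED sequences.
# Assumption: the inputs have already been aligned to equal length (gaps as "-").


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which both sequences carry the same base."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    # A gap paired with a gap is not counted as a match.
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 80.0) -> bool:
    """True when percent identity is at or above the chosen threshold."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-base sequences differing at the last two positions.
    print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
```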

Nucleic Acid Molecule Uses 

The nucleic acid molecules of the present invention are useful for probes, primers, 
chemical intermediates, and in biological assays. The nucleic acid molecules are useful as a 
hybridization probe for messenger RNA, transcript/cDNA and genomic DNA to isolate full- 
length cDNA and genomic clones encoding the peptide described in Figure 2 and to isolate 
cDNA and genomic clones that correspond to variants (alleles, orthologs, etc.) producing the 
same or related peptides shown in Figure 2. 113 SNPs, including 32 indels, have been 
identified in the gene encoding the transporter protein provided by the present invention and 
are given in Figure 3. 

The probe can correspond to any sequence along the entire length of the nucleic acid 
molecules provided in the Figures. Accordingly, it could be derived from 5' noncoding 
regions, the coding region, and 3' noncoding regions. However, as discussed, fragments are 
not to be construed as encompassing fragments disclosed prior to the present invention. 

The nucleic acid molecules are also useful as primers for PCR to amplify any given 
region of a nucleic acid molecule and are useful to synthesize antisense molecules of desired 
length and sequence. 

The nucleic acid molecules are also useful for constructing recombinant vectors. 
Such vectors include expression vectors that express a portion of, or all of, the peptide 
sequences. Vectors also include insertion vectors, used to integrate into another nucleic acid 
molecule sequence, such as into the cellular genome, to alter in situ expression of a gene 
and/or gene product. For example, an endogenous coding sequence can be replaced via 
homologous recombination with all or part of the coding region containing one or more 
specifically introduced mutations. 

The nucleic acid molecules are also useful for expressing antigenic portions of the 
proteins. 

The nucleic acid molecules are also useful as probes for determining the 
chromosomal positions of the nucleic acid molecules by means of in situ hybridization 
methods. As indicated by the data presented in Figure 3, the map position was determined 
to be on chromosome 4 by ePCR, and confirmed with radiation hybrid mapping. 

The nucleic acid molecules are also useful in making vectors containing the gene 
regulatory regions of the nucleic acid molecules of the present invention. 

The nucleic acid molecules are also useful for designing ribozymes corresponding to 
all, or a part, of the mRNA produced from the nucleic acid molecules described herein. 

The nucleic acid molecules are also useful for making vectors that express part, or 
all, of the peptides. 

The nucleic acid molecules are also useful for constructing host cells expressing a 
part, or all, of the nucleic acid molecules and peptides. 

The nucleic acid molecules are also useful for constructing transgenic animals 
expressing all, or a part, of the nucleic acid molecules and peptides. 

The nucleic acid molecules are also useful as hybridization probes for determining 
the presence, level, form and distribution of nucleic acid expression. Experimental data as 
provided in Figure 1 indicates that transporter proteins of the present invention are expressed 
in the brain. Specifically, a virtual northern blot shows expression in the human and fetal 
brain. In addition, a PCR-based tissue screening panel confirms expression in human fetal 
brain. 

Accordingly, the probes can be used to detect the presence of, or to determine levels 
of, a specific nucleic acid molecule in cells, tissues, and in organisms. The nucleic acid 
whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the 
peptides described herein can be used to assess expression and/or gene copy number in a 
given cell, tissue, or organism. These uses are relevant for diagnosis of disorders involving 
an increase or decrease in transporter protein expression relative to normal results. 

In vitro techniques for detection of mRNA include Northern hybridizations and in 
situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations 
and in situ hybridization. 

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues 
that express a transporter protein, such as by measuring a level of a transporter-encoding 
nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or 
determining if a transporter gene has been mutated. Experimental data as provided in Figure 
1 indicates that transporter proteins of the present invention are expressed in the brain. 
Specifically, a virtual northern blot shows expression in the human and fetal brain. In 
addition, a PCR-based tissue screening panel confirms expression in human fetal brain. 

Nucleic acid expression assays are useful for drug screening to identify compounds 
that modulate transporter nucleic acid expression. 

The invention thus provides a method for identifying a compound that can be used 
to treat a disorder associated with nucleic acid expression of the transporter gene, 
particularly biological and pathological processes that are mediated by the transporter in 
cells and tissues that express it. Experimental data as provided in Figure 1 indicates 
expression in the brain. The method typically includes assaying the ability of the compound 
to modulate the expression of the transporter nucleic acid and thus identifying a compound 
that can be used to treat a disorder characterized by undesired transporter nucleic acid 
expression. The assays can be performed in cell-based and cell-free systems. Cell-based 
assays include cells naturally expressing the transporter nucleic acid or recombinant cells 
genetically engineered to express specific nucleic acid sequences. 

The assay for transporter nucleic acid expression can involve direct assay of nucleic 
acid levels, such as mRNA levels, or assay of collateral compounds involved in the signal 
pathway. Further, the expression of genes that are up- or down-regulated in response to the 
transporter protein signal pathway can also be assayed. In this embodiment the regulatory 
regions of these genes can be operably linked to a reporter gene such as luciferase. 

Thus, modulators of transporter gene expression can be identified in a method 
wherein a cell is contacted with a candidate compound and the expression of mRNA 
determined. The level of expression of transporter mRNA in the presence of the candidate 
compound is compared to the level of expression of transporter mRNA in the absence of the 
candidate compound. The candidate compound can then be identified as a modulator of 
nucleic acid expression based on this comparison and be used, for example, to treat a 
disorder characterized by aberrant nucleic acid expression. When expression of mRNA is 
statistically significantly greater in the presence of the candidate compound than in its 
absence, the candidate compound is identified as a stimulator of nucleic acid expression. 
When nucleic acid expression is statistically significantly less in the presence of the 
candidate compound than in its absence, the candidate compound is identified as an 
inhibitor of nucleic acid expression. 
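
The comparison described above can be sketched in Python. The specification does not name a particular statistical test, so the exact two-sided permutation test on the difference in means used here is an assumption made for illustration, as are the function names, the significance level of 0.05, and the example expression values.

```python
# Illustrative sketch: classify a candidate compound from replicate mRNA
# expression measurements taken with and without the compound.
# Assumptions (not from the specification): an exact two-sided permutation
# test on the difference in means, and alpha = 0.05.
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = list(treated) + list(control)
    n = len(treated)
    observed = abs(mean(treated) - mean(control))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling n of the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, control, alpha=0.05):
    """Label the compound a stimulator, an inhibitor, or neither."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical relative mRNA levels, five replicates per condition.
    with_compound = [9.8, 10.1, 10.4, 9.9, 10.2]
    without_compound = [5.0, 5.2, 4.9, 5.1, 4.8]
    print(classify_compound(with_compound, without_compound))
```

Because the test enumerates every relabelling, it is only practical for small replicate counts, and with fewer than four replicates per group the smallest attainable p-value exceeds 0.05.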

The invention further provides methods of treatment, with the nucleic acid as a 
target, using a compound identified through drug screening as a gene modulator to modulate 
transporter nucleic acid expression in cells and tissues that express the transporter. 
Experimental data as provided in Figure 1 indicates that transporter proteins of the present 
invention are expressed in the brain. Specifically, a virtual northern blot shows expression 
in the human and fetal brain. In addition, a PCR-based tissue screening panel confirms 
expression in human fetal brain. Modulation includes both up-regulation (i.e. activation or 
agonization) and down-regulation (suppression or antagonization) of nucleic acid expression. 

Alternatively, a modulator for transporter nucleic acid expression can be a small 
molecule or drug identified using the screening assays described herein as long as the drug 
or small molecule inhibits the transporter nucleic acid expression in the cells and tissues that 
express the protein. Experimental data as provided in Figure 1 indicates expression in the 
brain. 

The nucleic acid molecules are also useful for monitoring the effectiveness of 
modulating compounds on the expression or activity of the transporter gene in clinical trials 
or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for 
the continuing effectiveness of treatment with the compound, particularly with compounds 
to which a patient can develop resistance. The gene expression pattern can also serve as a 
marker indicative of a physiological response of the affected cells to the compound. 
Accordingly, such monitoring would allow either increased administration of the compound 
or the administration of alternative compounds to which the patient has not become 
resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, 
administration of the compound could be commensurately decreased. 

The nucleic acid molecules are also useful in diagnostic assays for qualitative 
changes in transporter nucleic acid expression, and particularly in qualitative changes that 
lead to pathology. The nucleic acid molecules can be used to detect mutations in transporter 
genes and gene expression products such as mRNA. The nucleic acid molecules can be 
used as hybridization probes to detect naturally occurring genetic mutations in the 
transporter gene and thereby to determine whether a subject with the mutation is at risk for a 
disorder caused by the mutation. Mutations include deletion, addition, or substitution of one 
or more nucleotides in the gene; chromosomal rearrangement, such as inversion or 
transposition; modification of genomic DNA, such as aberrant methylation patterns; or 
changes in gene copy number, such as amplification. Detection of a mutated form of the 
transporter gene associated with a dysfunction provides a diagnostic tool for an active 
disease or susceptibility to disease when the disease results from overexpression, 
underexpression, or altered expression of a transporter protein. 

Individuals carrying mutations in the transporter gene can be detected at the nucleic 
acid level by a variety of techniques. Figure 3 provides information on SNPs that have been 
identified in a gene encoding the transporter protein of the present invention. 113 SNP 
variants were found, including 32 indels (indicated by a "-"). As indicated by the data 
presented in Figure 3, the map position was determined to be on chromosome 4 by ePCR, 
and confirmed with radiation hybrid mapping. Genomic DNA can be analyzed directly or 
can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same 
way. In some uses, detection of the mutation involves the use of a probe/primer in a 
polymerase chain reaction (PCR) (see, e.g. U.S. Patent Nos. 4,683,195 and 4,683,202), such 
as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., 
Landegran et al., Science 241:1077-1080 (1988); and Nakazawa et al., PNAS 91:360-364 
(1994)), the latter of which can be particularly useful for detecting point mutations in the 
gene (see Abravaya et al., Nucleic Acids Res. 25:675-682 (1995)). This method can include 
the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, 
mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or 
more primers which specifically hybridize to a gene under conditions such that 
hybridization and amplification of the gene (if present) occurs, and detecting the presence or 
absence of an amplification product, or detecting the size of the amplification product and 
comparing the length to a control sample. Deletions and insertions can be detected by a 
change in size of the amplified product compared to the normal genotype. Point mutations 
can be identified by hybridizing amplified DNA to normal RNA or antisense DNA 
sequences. 
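
The size comparison described above, in which deletions and insertions are inferred from a change in amplicon length relative to a control, can be sketched as follows. The function name, the tolerance parameter, and the example sizes are hypothetical; in practice the sizes would come from gel or capillary electrophoresis, whose resolution would determine a sensible tolerance.

```python
# Illustrative sketch: infer an insertion or deletion from amplicon length.
# Assumption: sizes are given in base pairs; tolerance_bp models the sizing
# resolution of the measurement and is a hypothetical parameter.


def classify_amplicon(sample_bp: int, control_bp: int, tolerance_bp: int = 0) -> str:
    """Compare a sample amplicon length with the control (normal genotype) length."""
    diff = sample_bp - control_bp
    if abs(diff) <= tolerance_bp:
        return "no size change"
    return "insertion" if diff > 0 else "deletion"


if __name__ == "__main__":
    # Hypothetical sizes: a 500 bp control product and a 488 bp sample product.
    print(classify_amplicon(488, 500))
```

A result of "no size change" does not exclude a point mutation, which leaves the length unaltered and would need one of the other methods described herein.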

Alternatively, mutations in a transporter gene can be directly identified, for example, 
by alterations in restriction enzyme digestion patterns determined by gel electrophoresis. 
Further, sequence-specific ribozymes (U.S. Patent No. 5,498,531) can be used to 
score for the presence of specific mutations by development or loss of a ribozyme cleavage 
site. Perfectly matched sequences can be distinguished from mismatched sequences by 
nuclease cleavage digestion assays or by differences in melting temperature. 
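
How a mutation can alter a restriction enzyme digestion pattern can be illustrated with a short Python sketch. It uses the well-known EcoRI recognition site GAATTC, which is cut after the first base, purely as an example; the function name and the two sequences are hypothetical and are not taken from the Figures. Only one strand of a linear molecule is modeled.

```python
# Illustrative sketch: fragment lengths from digesting a linear sequence.
# Assumptions: single-strand model of a linear molecule; EcoRI (G^AATTC) is
# used only as a familiar example enzyme; sequences are made-up placeholders.
from typing import List


def digest_fragment_lengths(seq: str, site: str, cut_offset: int) -> List[int]:
    """Return fragment lengths (largest first) after cutting at every site.

    cut_offset is the number of bases from the start of the site to the cut.
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(lengths, reverse=True)


if __name__ == "__main__":
    normal = "AAAGAATTCTTTT"   # contains one GAATTC site
    mutant = "AAAGAGTTCTTTT"   # single-base change destroys the site
    print(digest_fragment_lengths(normal, "GAATTC", 1))
    print(digest_fragment_lengths(mutant, "GAATTC", 1))
```

A difference between the two fragment-length lists corresponds to a difference in banding pattern that gel electrophoresis could resolve.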

Sequence changes at specific locations can also be assessed by nuclease protection 
assays such as RNase and S1 protection or the chemical cleavage method. Furthermore, 
sequence differences between a mutant transporter gene and a wild-type gene can be 
determined by direct DNA sequencing. A variety of automated sequencing procedures can 
be utilized when performing the diagnostic assays (Naeve, C.W., (1995) Biotechniques 
79:448), including sequencing by mass spectrometry (see, e.g., PCT International 
Publication No. WO 94/16101; Cohen et al., Adv. Chromatogr. 36:121-162 (1996); and 
Griffin et al., Appl. Biochem. Biotechnol. 55:147-159 (1993)). 

Other methods for detecting mutations in the gene include methods in which 
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or 
RNA/DNA duplexes (Myers et al., Science 230:1242 (1985); Cotton et al., PNAS 55:4397 
(1988); Saleeba et al., Meth. Enzymol. 277:286-295 (1992)), electrophoretic mobility of 
mutant and wild type nucleic acid is compared (Orita et al., PNAS 86:2766 (1989); Cotton et 
al., Mutat. Res. 255:125-144 (1993); and Hayashi et al., Genet. Anal. Tech. Appl. 9:73-79 
(1992)), and movement of mutant or wild-type fragments in polyacrylamide gels containing 
a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et 
al., Nature 573:495 (1985)). Examples of other techniques for detecting point mutations 
include selective oligonucleotide hybridization, selective amplification, and selective primer 
extension. 

The nucleic acid molecules are also useful for testing an individual for a genotype 
that while not necessarily causing the disease, nevertheless affects the treatment modality. 
Thus, the nucleic acid molecules can be used to study the relationship between an 
individual's genotype and the individual's response to a compound used for treatment 
(pharmacogenomic relationship). Accordingly, the nucleic acid molecules described herein 
can be used to assess the mutation content of the transporter gene in an individual in order to 
select an appropriate compound or dosage regimen for treatment. Figure 3 provides 
information on SNPs that have been identified in a gene encoding the transporter protein of 
the present invention. 113 SNP variants were found, including 32 indels (indicated by a "-"). 

Thus nucleic acid molecules displaying genetic variations that affect treatment 
provide a diagnostic target that can be used to tailor treatment in an individual. 
Accordingly, the production of recombinant cells and animals containing these 
polymorphisms allows effective clinical design of treatment compounds and dosage 
regimens. 

The nucleic acid molecules are thus useful as antisense constructs to control 
transporter gene expression in cells, tissues, and organisms. A DNA antisense nucleic acid 
molecule is designed to be complementary to a region of the gene involved in transcription, 
preventing transcription and hence production of transporter protein. An antisense RNA or 
DNA nucleic acid molecule would hybridize to the mRNA and thus block translation of 
mRNA into transporter protein. 

Alternatively, a class of antisense molecules can be used to inactivate mRNA in 
order to decrease expression of transporter nucleic acid. Accordingly, these molecules can 
treat a disorder characterized by abnormal or undesired transporter nucleic acid expression. 
This technique involves cleavage by means of ribozymes containing nucleotide sequences 
complementary to one or more regions in the mRNA that attenuate the ability of the mRNA 
to be translated. Possible regions include coding regions and particularly coding regions 
corresponding to the catalytic and other functional activities of the transporter protein, such 
as ligand binding. 

The nucleic acid molecules also provide vectors for gene therapy in patients 
containing cells that are aberrant in transporter gene expression. Thus, recombinant cells, 
which include the patient's cells that have been engineered ex vivo and returned to the 
patient, are introduced into an individual where the cells produce the desired transporter 
protein to treat the individual. 

The invention also encompasses kits for detecting the presence of a transporter 
nucleic acid in a biological sample. Experimental data as provided in Figure 1 indicates that 
transporter proteins of the present invention are expressed in the brain. Specifically, a 
virtual northern blot shows expression in the human and fetal brain. In addition, a PCR-based 
tissue screening panel confirms expression in human fetal brain. For example, the kit can 
comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting 
transporter nucleic acid in a biological sample; means for determining the amount of 
transporter nucleic acid in the sample; and means for comparing the amount of transporter 
nucleic acid in the sample with a standard. The compound or agent can be packaged in a 
suitable container. The kit can further comprise instructions for using the kit to detect 
transporter protein mRNA or DNA. 

Nucleic Acid Arrays 

The present invention further provides nucleic acid detection kits, such as arrays 
or microarrays of nucleic acid molecules that are based on the sequence information 
provided in Figures 1 and 3 (SEQ ID NOS:1 and 3). 

As used herein "Arrays" or "Microarrays" refers to an array of distinct 
polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon or 
other type of membrane, filter, chip, glass slide, or any other suitable solid support. In 
one embodiment, the microarray is prepared and used according to the methods described 
in US Patent 5,837,832, Chee et al., PCT application WO95/11995 (Chee et al.), 
Lockhart, D. J. et al. (1996; Nat. Biotech. 14: 1675-1680) and Schena, M. et al. (1996; 
Proc. Natl. Acad. Sci. 93: 10614-10619), all of which are incorporated herein in their 
entirety by reference. In other embodiments, such arrays are produced by the methods 
described by Brown et al., US Patent No. 5,807,522. 

The microarray or detection kit is preferably composed of a large number of 
unique, single-stranded nucleic acid sequences, usually either synthetic antisense 
oligonucleotides or fragments of cDNAs, fixed to a solid support. The oligonucleotides 
are preferably about 6-60 nucleotides in length, more preferably 15-30 nucleotides in 
length, and most preferably about 20-25 nucleotides in length. For a certain type of 
microarray or detection kit, it may be preferable to use oligonucleotides that are only 7-20 
nucleotides in length. The microarray or detection kit may contain oligonucleotides 
that cover the known 5' or 3' sequence; sequential oligonucleotides that cover the full 
length sequence; or unique oligonucleotides selected from particular areas along the 
length of the sequence. Polynucleotides used in the microarray or detection kit may be 
oligonucleotides that are specific to a gene or genes of interest. 

In order to produce oligonucleotides to a known sequence for a microarray or 
detection kit, the gene(s) of interest (or an ORF identified from the contigs of the present 
invention) is typically examined using a computer algorithm which starts at the 5' or at 
the 3' end of the nucleotide sequence. Typical algorithms will then identify oligomers of 
defined length that are unique to the gene, have a GC content within a range suitable for 
hybridization, and lack predicted secondary structure that may interfere with 
hybridization. In certain situations it may be appropriate to use pairs of oligonucleotides 
on a microarray or detection kit. The "pairs" will be identical, except for one nucleotide 
that preferably is located in the center of the sequence. The second oligonucleotide in the 
pair (mismatched by one) serves as a control. The number of oligonucleotide pairs may 
range from two to one million. The oligomers are synthesized at designated areas on a 
substrate using a light-directed chemical process. The substrate may be paper, nylon or 
other type of membrane, filter, chip, glass slide or any other suitable solid support. 
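
By way of illustration only, the oligomer selection and mismatch-pair design described above can be sketched in Python. The function names, the default length of 20 nucleotides, the 40-60% GC window, and the simple hairpin screen (a 4-base stem with a loop of at least 3 bases) are assumptions chosen for this sketch; they are not parameters specified herein, and practical design software uses more detailed thermodynamic models.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in an upper-case DNA string."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_hairpin(oligo, stem=4, min_loop=3):
    """Crude secondary-structure screen (assumed heuristic): True when a
    stem-length window can pair with a downstream reverse complement."""
    for i in range(len(oligo) - stem + 1):
        target = reverse_complement(oligo[i:i + stem])
        if oligo.find(target, i + stem + min_loop) != -1:
            return True
    return False


def candidate_oligos(gene, length=20, gc_range=(0.4, 0.6), background=()):
    """Scan from the 5' end and keep oligomers that occur once in the gene,
    are absent from the background sequences, fall inside the GC window,
    and pass the hairpin screen."""
    gene = gene.upper()
    windows = [gene[i:i + length] for i in range(len(gene) - length + 1)]
    counts = Counter(windows)
    picks = []
    for start, oligo in enumerate(windows):
        if counts[oligo] > 1:
            continue
        if any(oligo in other.upper() for other in background):
            continue
        if not gc_range[0] <= gc_fraction(oligo) <= gc_range[1]:
            continue
        if has_hairpin(oligo):
            continue
        picks.append((start, oligo))
    return picks


def mismatch_control(oligo):
    """Partner oligonucleotide identical except for the central base."""
    mid = len(oligo) // 2
    swapped = oligo[mid].translate(_COMPLEMENT)
    return oligo[:mid] + swapped + oligo[mid + 1:]
```

Each selected oligomer and the output of `mismatch_control` together form one "pair" in the sense used above; the substitution applied at the central position (the complementary base) is itself an arbitrary choice for this sketch.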

In another aspect, an oligonucleotide may be synthesized on the surface of the 
substrate by using a chemical coupling procedure and an ink jet application apparatus, as 
described in PCT application WO95/251116 (Baldeschweiler et al.) which is incorporated 
herein in its entirety by reference. In another aspect, a "gridded" array analogous to a dot 
(or slot) blot may be used to arrange and link cDNA fragments or oligonucleotides to the 
surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical 
bonding procedures. An array, such as those described above, may be produced by hand 
or by using available devices (slot blot or dot blot apparatus), materials (any suitable 
solid support), and machines (including robotic instruments), and may contain 8, 24, 96, 
384, 1536, 6144 or more oligonucleotides, or any other number between two and one 
million which lends itself to the efficient use of commercially available instrumentation. 

In order to conduct sample analysis using a microarray or detection kit, the RNA 
or DNA from a biological sample is made into hybridization probes. The mRNA is 
isolated, and cDNA is produced and used as a template to make antisense RNA (aRNA). 
The aRNA is amplified in the presence of fluorescent nucleotides, and labeled probes are 
incubated with the microarray or detection kit so that the probe sequences hybridize to 
complementary oligonucleotides of the microarray or detection kit. Incubation conditions 
are adjusted so that hybridization occurs with precise complementary matches or with 
various degrees of less complementarity. After removal of nonhybridized probes, a 
scanner is used to determine the levels and patterns of fluorescence. The scanned images 
are examined to determine degree of complementarity and the relative abundance of each 
oligonucleotide sequence on the microarray or detection kit. The biological samples may 
be obtained from any bodily fluids (such as blood, urine, saliva, phlegm, gastric juices, 
etc.), cultured cells, biopsies, or other tissue preparations. A detection system may be 
used to measure the absence, presence, and amount of hybridization for all of the distinct 
sequences simultaneously. This data may be used for large-scale correlation studies on 
the sequences, expression patterns, mutations, variants, or polymorphisms among 
samples. 
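
As a non-limiting illustration of how scanned fluorescence values might be evaluated, the following Python sketch compares the signal of each perfect-match oligonucleotide with that of its single-mismatch control. The function name, the input layout, and the `min_ratio` threshold of 1.5 are hypothetical and are offered only as an assumption for this example; no particular scoring rule is prescribed herein.

```python
def score_probe_pairs(intensities, min_ratio=1.5):
    """Score perfect-match (PM) versus mismatch (MM) fluorescence.

    intensities maps a probe name to a (pm_signal, mm_signal) tuple.
    Returns a dict mapping each name to (net_signal, present), where
    net_signal is PM minus MM and present is True when PM exceeds MM
    by at least the assumed ratio.
    """
    calls = {}
    for name, (pm, mm) in intensities.items():
        net = pm - mm
        present = net > 0 and pm > min_ratio * mm
        calls[name] = (net, present)
    return calls


# Illustrative values only; these are not measured data.
example = {"oligo_a": (1200.0, 300.0), "oligo_b": (310.0, 290.0)}
calls = score_probe_pairs(example)
```

The net signal gives a rough measure of relative abundance, while the ratio test gives a presence or absence call for each oligonucleotide on the array.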

Using such arrays, the present invention provides methods to identify the 
expression of the transporter proteins/peptides of the present invention. In detail, such 
methods comprise incubating a test sample with one or more nucleic acid molecules and 
assaying for binding of the nucleic acid molecule with components within the test 
sample. Such assays will typically involve arrays comprising many genes, at least one of 
which is a gene of the present invention and/or alleles of the transporter gene of the 
present invention. Figure 3 provides information on SNPs that have been identified in a 
gene encoding the transporter protein of the present invention. 113 SNP variants were 
found, including 32 indels (indicated by a "-"). 

Conditions for incubating a nucleic acid molecule with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid molecule used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or array assay formats can readily be adapted to employ the novel 
fragments of the Human genome disclosed herein. Examples of such assays can be found 
in Chard, T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier 
Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., 
Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 
(1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Enzyme Immunoassays: 
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science 
Publishers, Amsterdam, The Netherlands (1985). 

The test samples of the present invention include cells, protein or membrane 
extracts of cells. The test sample used in the above-described method will vary based on 
the assay format, nature of the detection method and the tissues, cells or extracts used as 
the sample to be assayed. Methods for preparing nucleic acid extracts of cells are well 
known in the art and can be readily adapted in order to obtain a sample that is 
compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain 
the necessary reagents to carry out the assays of the present invention. 
Specifically, the invention provides a compartmentalized kit to receive, in close 
confinement, one or more containers which comprises: (a) a first container comprising 
one of the nucleic acid molecules that can bind to a fragment of the Human genome 
disclosed herein; and (b) one or more other containers comprising one or more of the 
following: wash reagents, reagents capable of detecting presence of a bound nucleic acid. 

In detail, a compartmentalized kit includes any kit in which reagents are contained 
in separate containers. Such containers include small glass containers, plastic containers, 
strips of plastic, glass or paper, or arraying material such as silica. Such containers 
allow one to efficiently transfer reagents from one compartment to another compartment 
such that the samples and reagents are not cross-contaminated, and the agents or solutions 
of each container can be added in a quantitative fashion from one compartment to 
another. Such containers will include a container which will accept the test sample, a 
container which contains the nucleic acid probe, containers which contain wash reagents 
(such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the 
reagents used to detect the bound probe. One skilled in the art will readily recognize that 
the previously unidentified transporter gene of the present invention can be routinely 
identified using the sequence information disclosed herein and can be readily incorporated 
into one of the established kit formats which are well known in the art, particularly 
expression arrays. 

Vectors/host cells 

The invention also provides vectors containing the nucleic acid molecules described 
herein. The term "vector" refers to a vehicle, preferably a nucleic acid molecule, which can 
transport the nucleic acid molecules. When the vector is a nucleic acid molecule, the 
nucleic acid molecules are covalently linked to the vector nucleic acid. With this aspect of 
the invention, the vector includes a plasmid, single or double stranded phage, a single or 
double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, 
YAC, or MAC. 

A vector can be maintained in the host cell as an extrachromosomal element where it 
replicates and produces additional copies of the nucleic acid molecules. Alternatively, the 
vector may integrate into the host cell genome and produce additional copies of the nucleic 
acid molecules when the host cell replicates. 

The invention provides vectors for the maintenance (cloning vectors) or vectors for 
expression (expression vectors) of the nucleic acid molecules. The vectors can function in 
prokaryotic or eukaryotic cells or in both (shuttle vectors). 

Expression vectors contain cis-acting regulatory regions that are operably linked in 
the vector to the nucleic acid molecules such that transcription of the nucleic acid molecules 
is allowed in a host cell. The nucleic acid molecules can be introduced into the host cell 
with a separate nucleic acid molecule capable of affecting transcription. Thus, the second 
nucleic acid molecule may provide a trans-acting factor interacting with the cis-regulatory 
control region to allow transcription of the nucleic acid molecules from the vector. 
Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting 
factor can be produced from the vector itself. It is understood, however, that in some 
embodiments, transcription and/or translation of the nucleic acid molecules can occur in a 
cell-free system. 

The regulatory sequences to which the nucleic acid molecules described herein can 
be operably linked include promoters for directing mRNA transcription. These include, but 
are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters 
from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, 
the adenovirus early and late promoters, and retrovirus long-terminal repeats. 

In addition to control regions that promote transcription, expression vectors may also 
include regions that modulate transcription, such as repressor binding sites and enhancers. 
Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, 
polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers. 

In addition to containing sites for transcription initiation and control, expression 
vectors can also contain sequences necessary for transcription termination and, in the 
transcribed region, a ribosome binding site for translation. Other regulatory control elements 
for expression include initiation and termination codons as well as polyadenylation signals. 
The person of ordinary skill in the art would be aware of the numerous regulatory sequences 
that are useful in expression vectors. Such regulatory sequences are described, for example, 
in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, NY, (1989). 

A variety of expression vectors can be used to express a nucleic acid molecule. 
Such vectors include chromosomal, episomal, and virus-derived vectors, for example 
vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from 
yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as 
baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, 
pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of 
these sources such as those derived from plasmid and bacteriophage genetic elements, e.g. 
cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and 
eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, (1989). 

The regulatory sequence may provide constitutive expression in one or more host 
cells (i.e. tissue specific) or may provide for inducible expression in one or more cell types 
such as by temperature, nutrient additive, or exogenous factor such as a hormone or other 
ligand. A variety of vectors providing for constitutive and inducible expression in 
prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art. 

The nucleic acid molecules can be inserted into the vector nucleic acid by well-known 
methodology. Generally, the DNA sequence that will ultimately be expressed is 
joined to an expression vector by cleaving the DNA sequence and the expression vector 
with one or more restriction enzymes and then ligating the fragments together. Procedures 
for restriction enzyme digestion and ligation are well known to those of ordinary skill in the 
art. 

The vector containing the appropriate nucleic acid molecule can be introduced into 
an appropriate host cell for propagation or expression using well-known techniques. 
Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella 
typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as 
Drosophila, animal cells such as COS and CHO cells, and plant cells. 

As described herein, it may be desirable to express the peptide as a fusion protein. 
Accordingly, the invention provides fusion vectors that allow for the production of the 
peptides. Fusion vectors can increase the expression of a recombinant protein, increase the 
solubility of the recombinant protein, and aid in the purification of the protein by acting for 
example as a ligand for affinity purification. A proteolytic cleavage site may be introduced 
at the junction of the fusion moiety so that the desired peptide can ultimately be separated 
from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, 
thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et 
al., Gene 67:31-40 (1988)), pMAL (New England Biolabs, Beverly, MA) and pRIT5 
(Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding 
protein, or protein A, respectively, to the target recombinant protein. Examples of suitable 
inducible non-fusion E. coli expression vectors include pTrc (Amann et al., Gene 69:301-315 
(1988)) and pET 11d (Studier et al., Gene Expression Technology: Methods in 
Enzymology 185:60-89 (1990)). 

Recombinant protein expression can be maximized in host bacteria by providing a 
genetic background wherein the host cell has an impaired capacity to proteolytically cleave 
the recombinant protein. (Gottesman, S., Gene Expression Technology: Methods in 
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Alternatively, 
the sequence of the nucleic acid molecule of interest can be altered to provide preferential 
codon usage for a specific host cell, for example E. coli. (Wada et al., Nucleic Acids Res. 
20:2111-2118 (1992)). 
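
Solely as an illustrative sketch, altering a coding sequence toward preferential codon usage may be expressed in Python as follows. The `usage` mapping of codons to relative frequencies is a hypothetical input supplied by the user for the chosen host; the values shown are placeholders for illustration and are not taken from Wada et al. or any other source. The sketch assumes the standard genetic code and an in-frame sequence containing only A, C, G and T.

```python
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Synonymous codons grouped by the amino acid (or stop) they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def translate(cds):
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, usage):
    """Replace each codon with the synonymous codon of highest usage.

    usage maps codon -> relative frequency in the intended host; codons
    missing from the mapping count as 0.0 and a codon is kept unless a
    synonym has strictly higher usage.
    """
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        best = max(SYNONYMS[CODON_TABLE[codon]], key=lambda c: usage.get(c, 0.0))
        out.append(best if usage.get(best, 0.0) > usage.get(codon, 0.0) else codon)
    return "".join(out)


# Placeholder frequencies, for illustration only.
example_usage = {"CTG": 0.5, "TTG": 0.1, "GGC": 0.4, "GGT": 0.3}
recoded = recode("ATGGGTCCTTTGAAA", example_usage)
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged, which can be confirmed by comparing `translate` of the original and recoded sequences.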

The nucleic acid molecules can also be expressed by expression vectors that are 
operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include 
pYepSec1 (Baldari et al., EMBO J. 6:229-234 (1987)), pMFa (Kurjan et al., Cell 30:933-943 
(1982)), pJRY88 (Schultz et al., Gene 54:113-123 (1987)), and pYES2 (Invitrogen 
Corporation, San Diego, CA). 

The nucleic acid molecules can also be expressed in insect cells using, for example, 
baculovirus expression vectors. Baculovirus vectors available for expression of proteins in 
cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al., Mol. Cell Biol. 
3:2156-2165 (1983)) and the pVL series (Lucklow et al., Virology 170:31-39 (1989)). 

In certain embodiments of the invention, the nucleic acid molecules described herein 
are expressed in mammalian cells using mammalian expression vectors. Examples of 
mammalian expression vectors include pCDM8 (Seed, B. Nature 329:840 (1987)) and 
pMT2PC (Kaufman et al., EMBO J. 6:187-195 (1987)). 

The expression vectors listed herein are provided by way of example only of the 
well-known vectors available to those of ordinary skill in the art that would be useful to 
express the nucleic acid molecules. The person of ordinary skill in the art would be aware 
of other vectors suitable for maintenance, propagation, or expression of the nucleic acid 
molecules described herein. These are found for example in Sambrook, J., Fritsch, E. F., and 
Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor 
Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989. 

The invention also encompasses vectors in which the nucleic acid sequences 
described herein are cloned into the vector in reverse orientation, but operably linked to a 
regulatory sequence that permits transcription of antisense RNA. Thus, an antisense 
transcript can be produced to all, or to a portion, of the nucleic acid molecule sequences 
described herein, including both coding and non-coding regions. Expression of this 
antisense RNA is subject to each of the parameters described above in relation to expression 
of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific 
expression). 
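
The relationship between a sense sequence and the antisense transcript obtained from a reverse-orientation insert can be illustrated, by way of a minimal sketch, in Python. The function names are hypothetical, and the sketch considers only Watson-Crick pairing over the full length of the two strands; it does not model regulatory sequences or partial hybridization.

```python
_DNA_TO_ANTISENSE_RNA = str.maketrans("ACGT", "UGCA")


def antisense_rna(sense_dna):
    """Antisense RNA, written 5' to 3', for a sense-strand DNA sequence
    (also written 5' to 3') cloned in reverse orientation."""
    return sense_dna.upper().translate(_DNA_TO_ANTISENSE_RNA)[::-1]


def sense_mrna(sense_dna):
    """mRNA corresponding to the sense strand (T replaced by U)."""
    return sense_dna.upper().replace("T", "U")


def is_complementary(mrna, antisense):
    """True when the antisense strand pairs base-for-base with the mRNA
    in antiparallel orientation."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    return len(mrna) == len(antisense) and all(
        (m, a) in pairs for m, a in zip(mrna, reversed(antisense))
    )


# Short fragment used only as an example input.
fragment = "ATGGGTCC"
transcript = antisense_rna(fragment)
```

An antisense transcript to only a portion of a sequence can be obtained in the same way by passing the corresponding substring of the sense strand.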

The invention also relates to recombinant host cells containing the vectors described 
herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, 
other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian 
cells. 

The recombinant host cells are prepared by introducing the vector constructs 
described herein into the cells by techniques readily available to the person of ordinary skill 
in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated 
transfection, cationic lipid-mediated transfection, electroporation, 
transduction, infection, lipofection, and other techniques such as those found in Sambrook, 
et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor 
Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989). 

Host cells can contain more than one vector. Thus, different nucleotide sequences 
can be introduced on different vectors of the same cell. Similarly, the nucleic acid 
molecules can be introduced either alone or with other nucleic acid molecules that are not 
related to the nucleic acid molecules such as those providing trans-acting factors for 
expression vectors. When more than one vector is introduced into a cell, the vectors can be 
introduced independently, co-introduced or joined to the nucleic acid molecule vector. 

In the case of bacteriophage and viral vectors, these can be introduced into cells as 
packaged or encapsulated virus by standard procedures for infection and transduction. Viral 
vectors can be replication-competent or replication-defective. In the case in which viral 
replication is defective, replication will occur in host cells providing functions that 
complement the defects. 

Vectors generally include selectable markers that enable the selection of the 
subpopulation of cells that contain the recombinant vector constructs. The marker can be 
contained in the same vector that contains the nucleic acid molecules described herein or 
may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for 
prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic 
host cells. However, any marker that provides selection for a phenotypic trait will be 
effective. 

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and 
other cells under the control of the appropriate regulatory sequences, cell-free transcription 
and translation systems can also be used to produce these proteins using RNA derived from 
the DNA constructs described herein. 

Where secretion of the peptide is desired, which is difficult to achieve with 
multi-transmembrane domain containing proteins such as transporters, appropriate secretion 
signals are incorporated into the vector. The signal sequence can be endogenous to the 
peptides or heterologous to these peptides. 

Where the peptide is not secreted into the medium, which is typically the case with 
transporters, the protein can be isolated from the host cell by standard disruption procedures, 
including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. 
The peptide can then be recovered and purified by well-known purification methods 
including ammonium sulfate precipitation, acid extraction, anion or cation exchange 
chromatography, phosphocellulose chromatography, hydrophobic-interaction 
chromatography, affinity chromatography, hydroxylapatite chromatography, lectin 
chromatography, or high performance liquid chromatography. 

It is also understood that depending upon the host cell in recombinant production of 
the peptides described herein, the peptides can have various glycosylation patterns, 
depending upon the cell, or may be non-glycosylated as when produced in bacteria. In 
addition, the peptides may include an initial modified methionine in some cases as a result 
of a host-mediated process. 

Uses of vectors and host cells 

The recombinant host cells expressing the peptides described herein have a variety 
of uses. First, the cells are useful for producing a transporter protein or peptide that can be 
further purified to produce desired amounts of transporter protein or fragments. Thus, host 
cells containing expression vectors are useful for peptide production. 

Host cells are also useful for conducting cell-based assays involving the transporter 
protein or transporter protein fragments, such as those described above as well as other 
formats known in the art. Thus, a recombinant host cell expressing a native transporter 
protein is useful for assaying compounds that stimulate or inhibit transporter protein 
function. 

Host cells are also useful for identifying transporter protein mutants in which these 
functions are affected. If the mutants naturally occur and give rise to a pathology, host cells 
containing the mutations are useful to assay compounds that have a desired effect on the 
mutant transporter protein (for example, stimulating or inhibiting function) which may not 
be indicated by their effect on the native transporter protein. 

Genetically engineered host cells can be further used to produce non-human 
transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such 
as a rat or mouse, in which one or more of the cells of the animal include a transgene. A 
transgene is exogenous DNA that is integrated into the genome of a cell from which a 
transgenic animal develops and which remains in the genome of the mature animal in one or 
more cell types or tissues of the transgenic animal. These animals are useful for studying 
the function of a transporter protein and identifying and evaluating modulators of transporter 
protein activity. Other examples of transgenic animals include non-human primates, sheep, 
dogs, cows, goats, chickens, and amphibians. 

A transgenic animal can be produced by introducing nucleic acid into the male 
pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the 
oocyte to develop in a pseudopregnant female foster animal. Any of the transporter protein 
nucleotide sequences can be introduced as a transgene into the genome of a non-human 
animal, such as a mouse. 

Any of the regulatory or other sequences useful in expression vectors can form part 
of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if 
not already included. A tissue-specific regulatory sequence(s) can be operably linked to the 
transgene to direct expression of the transporter protein to particular cells. 

Methods for generating transgenic animals via embryo manipulation and 
microinjection, particularly animals such as mice, have become conventional in the art and 
are described, for example, in U.S. Patent Nos. 4,736,866 and 4,870,009, both by Leder et 
al., U.S. Patent No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse 
Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar 
methods are used for production of other transgenic animals. A transgenic founder animal 
can be identified based upon the presence of the transgene in its genome and/or expression 
of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then 
be used to breed additional animals carrying the transgene. Moreover, transgenic animals 
carrying a transgene can further be bred to other transgenic animals carrying other 
transgenes. A transgenic animal also includes animals in which the entire animal or tissues 
in the animal have been produced using the homologously recombinant host cells described 
herein. 

In another embodiment, transgenic non-human animals can be produced which 
contain selected systems that allow for regulated expression of the transgene. One example 
of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description 
of the cre/loxP recombinase system, see, e.g., Lakso et al. PNAS 89:6232-6236 (1992). 
Another example of a recombinase system is the FLP recombinase system of S. cerevisiae 
(O'Gorman et al. Science 251:1351-1355 (1991)). If a cre/loxP recombinase system is used 
to regulate expression of the transgene, animals containing transgenes encoding both the Cre 
recombinase and a selected protein are required. Such animals can be provided through the 
construction of "double" transgenic animals, e.g., by mating two transgenic animals, one 
containing a transgene encoding a selected protein and the other containing a transgene 
encoding a recombinase. 

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, I. et al. Nature 385:810-813 (1997) and PCT 
International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a 
somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle 
and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of electrical 
pulses, to an enucleated oocyte from an animal of the same species from which the 
quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to 
morula or blastocyst and then transferred to a pseudopregnant female foster animal. The 
offspring born of this female foster animal will be a clone of the animal from which the cell, 
e.g., the somatic cell, is isolated. 

Transgenic animals containing recombinant cells that express the peptides described 
herein are useful to conduct the assays described herein in an in vivo context. Accordingly, 
the various physiological factors that are present in vivo and that could affect ligand binding, 
transporter protein activation, and signal transduction, may not be evident from in vitro 
cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals 
to assay in vivo transporter protein function, including ligand interaction, the effect of 
specific mutant transporter proteins on transporter protein function and ligand interaction, 
and the effect of chimeric transporter proteins. It is also possible to assess the effect of null 
mutations, that is mutations that substantially or completely eliminate one or more 
transporter protein functions. 

All publications and patents mentioned in the above specification are herein 
incorporated by reference. Various modifications and variations of the described method 
and system of the invention will be apparent to those skilled in the art without departing 
from the scope and spirit of the invention. Although the invention has been described in 
connection with specific preferred embodiments, it should be understood that the 
invention as claimed should not be unduly limited to such specific embodiments. Indeed, 
various modifications of the above-described modes for carrying out the invention which 
are obvious to those skilled in the field of molecular biology or related fields are intended 
to be within the scope of the following claims. 



SEQUENCE LISTING 

<110> BRANDON, Rhonda et al. 

<120> ISOLATED HUMAN TRANSPORTER PROTEINS, 

NUCLEIC ACID MOLECULES ENCODING HUMAN TRANSPORTER PROTEINS, 
AND USES THEREOF 

<130> CL001006-CIP 

<150> 09/730,002 
<151> 2000-12-06 

<160> 4 

<170> FastSEQ for Windows Version 4.0 

<210> 1 

<211> 2819 

<212> DNA 

<213> Human 

<400> 1 

tgaggtgcac tgcctttcca cactctccct tctgtactca gccagctgct gctgaggtgg 60 
gaggaaaagt cctggctggg agaattgagc tagtgcagca cacgtaaaaa agcgattccg 120 
atgggtcctt tgaaagcttt tctcttctcc ccttttcttc tgcggagtca aagtagaggg 180 
gtgaggttgg tcttcttgtt actgaccctg catttgggaa actgtgttga taaggcagat 240 
gatgaagatg atgaggattt aacggtgaac aaaacctggg tcttggcccc aaaaattcat 300 
gaaggagata tcacacaaat tctgaattca ttgcttcaag gctatgacaa taaacttcgt 360 
ccagatatag gagtgaggcc cacggtaatt gaaactgatg tttatgtaaa cagcattgga 420 
ccagttgatc caattaatat ggaatataca atagatataa tttttgccca aacctggttt 480 
gacagtcgtt taaaattcaa tagtaccatg aaagtgctta tgcttaacag taatatggtt 540 
ggaaaaattt ggattcctga cactttcttc agaaactcaa gaaaatctga tgctcactgg 600 
ataacaactc ctaatcgtct gcttcgaatt tggaatgatg gacgagttct gtatactcta 660 
agattgacaa ttaatgcaga atgttatctt cagcttcata actttcccat ggatgaacat 720 
tcctgtccac tggaattttc aagctatgga taccctaaaa atgaaattga gtataagtgg 780 
aaaaagccct ccgtagaagt ggctgatcct aaatactgga gattatatca gtttgcattt 840 
gtagggttac ggaactcaac tgaaatcact cacacgatct ctggggatta tgttatcatg 900 
acaatttttt ttgacctgag cagaagaatg ggatatttca ctattcagac ctacattcca 960 
tgcattctga cagttgttct ttcttgggtg tctttttgga tcaataaaga tgcagtgcct 1020 
gcaagaacat cgttgggtat cactacagtt ctgactatga caaccctgag tacaattgcc 1080 
aggaagtctt tacctaaggt ttcttatgtg actgcgatgg atctctttgt ttctgtttgt 1140 
ttcatttttg tttttgcagc cttgatggaa tatggaacct tgcattattt taccagcaac 1200 
caaaaaggaa agactgctac taaagacaga aagctaaaaa ataaagcctc gatgactcct 1260 
ggtctccatc ctggatccac tctgattcca atgaataata tttctgtgcc gcaagaagat 1320 
gattatgggt atcagtgttt ggagggcaaa gattgtgcca gcttcttctg ttgctttgaa 1380 
gactgcagaa caggatcttg gagggaagga aggatacaca tacgcattgc caaaattgac 1440 
tcttattcta gaatattttt cccaaccgct tttgccctgt tcaacttggt ttattgggtt 1500 
ggctatcttt acttataaaa tctacttcat aagcaaaaat caaaagaagt ctgactaaat 1560 
tcagtagaat cttttgtact tcagtaactt gaagtttaaa tttaaaatgc agagagacca 1620 
atggttaaaa tgtgaatagt attgtaacta ttttaaggcc ttcagaagta aataaagtag 1680 
cagctttcag gctaatttac gtgaaactga ttagttgcaa aatccagtag gttaaaatac 1740 
tcacatattt ttacttaaat tttctttaat ttacttatat gttattataa ttttgaattt 1800 
ttaagttcta tgattcatgt tttaaagatg gaatagtttt aatacatatt ttgtttaaat 1860 
ataatctata attgttttgt aatgtaagac taattactaa tatttatgta gcaacttttg 1920 
tgccgaaaaa agactgttaa tttgtttttt cttgcttttc atttgattac cctgcttgaa 1980
atacagttag ttgataacat aaagccatag ttttcttgga ttttcttcca agatattgta 2040
ttcccaagaa gaaattgatt tatttttaaa ctatcagtta ctgaagactt atgaaaaggt 2100
caatttttac ctgtctttta atccagtcca ttttctgaca caatattaaa cagaacgcca 2160
gttgcattta ctttggtgat ttgcaaactt ggaatgaagc caccagtcat tttttaagaa 2220
tgcaagatga aaaaatcaag gtaaaatcta accattttat tctctgcttc atagcattta 2280
taatggtttg atgaggttaa cactgaaata ttaaaatctg caaatgcacc attaatggcg 2340
aattaaatgc cacagagaaa tggattattt tttgtcttta tggttttatg gaaaggtgtt 2400
ggcagcgccc ttacaattaa aagtatttga ggaaacaatt ggcttggaat ttaactggat 2460
gtgggttaat ataaattatt gaccctgggc aaaactgggg gttgggccgg ggtaacaatt 2520
gggaggaccg aaaattcttt aaatattgct gctttattag gcgctcgggg tttaaaacct 2580
tgataattga cgcccgaaaa aacaggggtt taaggggccc cgccggcagg caaatttggc 2640
cctttttttc aatgtttcaa aggattcttc acggacccct cgttcaatgt tttgcgaaca 2700
ctcgcttata ttttgccgtc gctctcagga ctgccgtcgt ccttacatta ttatcgctaa 2760
gaaaaacctg gggggggggc acagcgccgt ctcaaagaat ctgtgggttt actgagcgc 2819



<210> 2 
<211> 465 
<212> PRT 
<213> Human 



<400> 2 

Met Gly Pro Leu Lys Ala Phe Leu Phe Ser Pro Phe Leu Leu Arg Ser
1               5                   10                  15

Gln Ser Arg Gly Val Arg Leu Val Phe Leu Leu Leu Thr Leu His Leu
            20                  25                  30

Gly Asn Cys Val Asp Lys Ala Asp Asp Glu Asp Asp Glu Asp Leu Thr
        35                  40                  45

Val Asn Lys Thr Trp Val Leu Ala Pro Lys Ile His Glu Gly Asp Ile
    50                  55                  60

Thr Gln Ile Leu Asn Ser Leu Leu Gln Gly Tyr Asp Asn Lys Leu Arg
65                  70                  75                  80

Pro Asp Ile Gly Val Arg Pro Thr Val Ile Glu Thr Asp Val Tyr Val
                85                  90                  95

Asn Ser Ile Gly Pro Val Asp Pro Ile Asn Met Glu Tyr Thr Ile Asp
            100                 105                 110

Ile Ile Phe Ala Gln Thr Trp Phe Asp Ser Arg Leu Lys Phe Asn Ser
        115                 120                 125

Thr Met Lys Val Leu Met Leu Asn Ser Asn Met Val Gly Lys Ile Trp
    130                 135                 140

Ile Pro Asp Thr Phe Phe Arg Asn Ser Arg Lys Ser Asp Ala His Trp
145                 150                 155                 160

Ile Thr Thr Pro Asn Arg Leu Leu Arg Ile Trp Asn Asp Gly Arg Val
                165                 170                 175

Leu Tyr Thr Leu Arg Leu Thr Ile Asn Ala Glu Cys Tyr Leu Gln Leu
            180                 185                 190

His Asn Phe Pro Met Asp Glu His Ser Cys Pro Leu Glu Phe Ser Ser
        195                 200                 205

Tyr Gly Tyr Pro Lys Asn Glu Ile Glu Tyr Lys Trp Lys Lys Pro Ser
    210                 215                 220

Val Glu Val Ala Asp Pro Lys Tyr Trp Arg Leu Tyr Gln Phe Ala Phe
225                 230                 235                 240

Val Gly Leu Arg Asn Ser Thr Glu Ile Thr His Thr Ile Ser Gly Asp
                245                 250                 255

Tyr Val Ile Met Thr Ile Phe Phe Asp Leu Ser Arg Arg Met Gly Tyr
            260                 265                 270

Phe Thr Ile Gln Thr Tyr Ile Pro Cys Ile Leu Thr Val Val Leu Ser
        275                 280                 285

Trp Val Ser Phe Trp Ile Asn Lys Asp Ala Val Pro Ala Arg Thr Ser
    290                 295                 300

Leu Gly Ile Thr Thr Val Leu Thr Met Thr Thr Leu Ser Thr Ile Ala
305                 310                 315                 320

Arg Lys Ser Leu Pro Lys Val Ser Tyr Val Thr Ala Met Asp Leu Phe
                325                 330                 335

Val Ser Val Cys Phe Ile Phe Val Phe Ala Ala Leu Met Glu Tyr Gly
            340                 345                 350

Thr Leu His Tyr Phe Thr Ser Asn Gln Lys Gly Lys Thr Ala Thr Lys
        355                 360                 365

Asp Arg Lys Leu Lys Asn Lys Ala Ser Met Thr Pro Gly Leu His Pro
    370                 375                 380

Gly Ser Thr Leu Ile Pro Met Asn Asn Ile Ser Val Pro Gln Glu Asp
385                 390                 395                 400

Asp Tyr Gly Tyr Gln Cys Leu Glu Gly Lys Asp Cys Ala Ser Phe Phe
                405                 410                 415

Cys Cys Phe Glu Asp Cys Arg Thr Gly Ser Trp Arg Glu Gly Arg Ile
            420                 425                 430

His Ile Arg Ile Ala Lys Ile Asp Ser Tyr Ser Arg Ile Phe Phe Pro
        435                 440                 445

Thr Ala Phe Ala Leu Phe Asn Leu Val Tyr Trp Val Gly Tyr Leu Tyr
    450                 455                 460

Leu
465



<210> 3 
<211> 82938 
<212> DNA 
<213> Human 

<400> 3 

gatggctagg cagagaaagc gcctgcttcc accacctcca gcgtcttcgc ctgtacagtt 60
gttccagctg ccagaggctt tctggttacc atggcaaccg tcgggctctg ctaggaactc 120
aaagcagagc ctctgaagtt aggtaccaaa cactgaggct taacagatgc aaaagcagat 180
ggggagtggg gcagagaact aggattgccc atctaatagg tctacagctc ctatggctat 240
gacaacattt ataaaatcag acttctgttt ttgtgtttgt aagatgagat taagatgaga 300
tagttcttgc ctggcattat tcattagtgt catctttcgg tgcccggcgg gagaagaaga 360
ctgcatatcc tctctcacta ttaaggcggt ggaagggggt tggggggagc aaacgtggtg 420
accaggaatg catggagcct gctcattcat tcattcattc attcaattat acattcaaca 480
tctcctgagt gctcatttgt gcggggcatc ctgccaggtg ctggatctgc aaagagaaag 540
gagacaccgc cccacccaca aaattaacaa aatgattaac acaaattaac agaatcctca 600
cgaacctggg gaaaaccaat taggcactgc ctgcttccca cgggtctctc ttggaagaca 660
gatttagaat ccgaacgctg gagaggaaag aggcagtcgc cccgaaatag agtcagacaa 720
gaggaagagc agaaggggag aaaggcagcc tctccctagg gagggattct gttcaaggcc 780
actcaaacgt ttctccgccg ctgcgaggga gaagaggttt gtttttcttt tagaaaaagg 840
gagccctttt ccctactggg ctcaatttct agtaggagaa ccccgcctcc ttgggctact 900
gattggctca actcttttta agataattta tgccaccatc ctctcggcct gattggctcc 960
ctggctcaat tctgctggga gtcgcatcct acctgtttgg gaggtgcact gcctttccac 1020
actctccctt ctgtactcag ccagctgctg ctgaggtggg aggaaaagtc ctggctggga 1080
gaattgagct agtgcagcac acgtaaaaaa gcgattccga tgggtccttt gaaagctttt 1140
ctcttctccc cttttcttct gcggagtcaa agtagagggg tgaggttggt cttcttgtta 1200
ctgaccctgc atttgggaaa ctggtgagta aattcagggg tttaaaacta ttctttgcta 1260
tctacccctt tccccttatt cccctttctt ttaaaaatta ttccacttta agtaattaat 1320
ttgcatttcc taacacatat gtggtattta ccatcgtttt attttctgat tagatctatt 1380
cattctcaga tagaaggaaa gtaacatatt tctgaagcat gagattggtt cagttgtttt 1440
tgacttcaag cagagtataa actagtgaac tagtgaacta atgaaggttg agttactggg 1500
agggagtgag ggggagtaga aatgactttg ccagtaaatt tactgaatga atgccaccga 1560
agaactttct ggatgattct gcattttaaa aacaaagtta tttcagaaac ttgttgttac 1620
acagcatgat atgtatgatg tttttctacc tactgctatt gcttttcccc tttcctcgtt 1680
tttggtaact ttctggaaca aaccagtgtt ttacatcaag aaatattgta aatctggcta 1740
aaaacaaatc tatcagcatt aaaaatgtat agtatccaag taaaaactga ccgtgtatat 1800
ttaatttgaa ccagctttga gtgttttttt ttaaaattgg ctgtccatca gtaattaaaa 1860
atgtgtggcc acatgaatgt cttggctctg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 1920
tgtgtgtgtg tttgtgtgta tggggacaga gagagagaga gagagaaaag tgtcaccgat 1980
cacaaatttc aaatttaaca gaaacaaaac tggactggaa tacctttcag taaaaatgtt 2040
taagtcattc tggtagacat ataaaatgct caggtatcat tcttcgatta tacgtaggag 2100
tagtgggtga cttgagctaa aatacgttga ttaagacata gacacttata aaggtctggg 2160
acatggtact cataccatgc aacttgtttc aaataaatgt acttttatca ttacattatt 2220
aaaaaagtaa ctctgtatag aattcaaaag taaagtacaa tcattatgaa gatagaaaca 2280
tgcctttata actatttgta tcactgctaa aatgaattat tattataaac gtaagtgtaa 2340
agctagagtg aatttttttg ccacaccagc atatagaaca agtatatttt ctcttagaaa 2400
atagagattg gattaaagta ggctaagtaa tattagacct gttagaattt tttgacagtt 2460
agtgcttttg taggactgaa gttttctctc tctttttttt taaccctgtt cccttagatt 2520
cttaaatgtt tgacacgaac ataaacataa tactggatga attatttctt gtatatattg 2580
tgtcgtaatg gcttattctt tgaatatatt atagcatttt ataaaagata tttaggtata 2640
cacatacatc ataatattac aaaagtaaaa acagatttac agtagttttc tgaaataaca 2700
accttttatc attttatacg agttttcctt tcaaacaaat aggcaagctg cttagtaaag 2760
agcatattat ctatcccaag agatgtttta ttttaaataa acatgttttg ttttggggaa 2820
tatttgcaat cattgtaatt ggctccagag ccaattacat caagctccca gaattgaaga 2880
gatatgtgct gaagggacag agtgcaggtt tagcagcctg gagacatgat ctcatgtgtc 2940
aagggccttt aacctctctg aacacaggtt catctatgaa atgaaggata ctagaccaca 3000
tatgatttct taagagtttc acattcaata catctatgat ttcaactgaa acatgaaata 3060
agtcaatggt tgcttcactg ttttatctaa gtacttttca agagggaaat tctaagtgtg 3120
aatactatgg actttttaac catatgtata agtgaggaag gtattgtaca gttaacacat 3180
aatttttatt gtcatagtaa tgtagataac ctggattatt ttgtccagtg aataaatata 3240
cattaaaaga tatatttata agctgttgtt ctaattagcc cttacggcct cagttatgac 3300
agttcatgtt atacatttaa aagtaccccc aaaaacataa tttttaagaa tgaggtagaa 3360
agctagcaat aaaaggcaaa tgtaagcatg ctttctttaa gatgcatcat atactgtttt 3420
cttcattgaa tggaaaaagt ttgacacaga aacacttttg taggcaaacc atgagaatgt 3480
tattagaaca tgtttctttc tccattgggt tctatgagac caagttggga aatatagcaa 3540
ttaaaaggaa ataaagcatg cttttgctaa aaaataaatt taaaaaataa tgttgcaatg 3600
aacattcttt taaaataagt aaaataaagt agctacagtt gggtgaactt tttatccttt 3660
taatatttag ttggaggtaa cagatgctaa catgtataaa tcatgtacga tcatatttct 3720
ctctcccttt ccctttcttt tggttagttt tcccttccaa aactaaccat aaatagttca 3780
tttgtaagca ttttagtgtt ttatatgatt tccaaaatat atcttgaaag tcatctattt 3840
tggggacttc ccaaattgga agcatttttt ttgtatttaa tcagttgtaa tgcttagttt 3900
cctcattaat ttccatcaac agtatagagg agaaacacta ttttattttt tattgctgaa 3960
ctatgaaaac ctatcctatc attgtaacta aaactgtagt catggttcct gtgtatcatt 4020
ctctaggaga ctctttggca tcttgacaag ttctttagaa tctcttcagc ttgatttaac 4080
ttctttgtaa tgatgaaaat aatctagtta taatcaggtt tttaaaaaga atttaaaaag 4140
atattttagc tgttaccaaa agtaacctac aataagcctt gactaagagt tatacattga 4200
aagagctcca ggaaaaaatc acaattttct tttttcagaa caatttccat gagtcaataa 4260
tcatactatg tggcattcag ttgcccggtt gaaaatattc aaataaactt gctttttaaa 4320
atatattatt ttcaagcaat ttcagtagta tgttttgtgt gggtagtaat gaatgaataa 4380
ttttttaagt gtctagattt ttggcttaaa tttgtaataa aatgttgatg ataaccattc 4440
aaacttgaaa agcaaatgac taagtaaaga tattccatga tagttaaata tgcaaaagac 4500
tgagaggaaa atgatttttg aagctaaaat gcaaatgtcg taattgataa aacccctgtg 4560
tacagagtct tttagtttat ccaaacaatc agtatctttc ttccattcta tactatgaag 4620
aacacttatg gctaaggaac aagaggaaat gtttgatatt tcattcaaat attaaataga 4680
ttaattgaat gtaagaatat aaggatgaag gtggaggccc aaaggagggg tgaaaagaaa 4740
gagtaaaaaa aactgaaaaa tgactctgga gatactgaga aaatgattac caaaatatga 4800
taaatgactt ccatgaaaga aaaggaagaa tgagcttcaa gagagtaatt cacatatggg 4860
caggctttca agtgagaatc ctggccttta ccaacagatc taaattccag tttcagctcc 4920
accatttctc actggagtga tttcagacaa ctccttgagg cagaaaatct tcattagtgt 4980
ggagttgtcg tatgcatttc cagaaatgaa atttccaaga aagaaaatct gtgcaaagaa 5040
actagcactt aaaattttat atgaaaagaa agtcataaca gatattctgt ggaattatag 5100
ttattctatt atgatatagc ctcgctaaag tggttttaag tgcagtagag tgaaatatat 5160
ggatccttgt gtgttcttaa atgtctttta agtcttttaa atgtctttta aatgcagctt 5220
atgcttattc atggctaaga cctagattta tgacactaat taaataaaaa atatatttta 5280
atcaaaatta aaaatacacc tctagaactc taaagataaa cagaggagcc taaatttttg 5340
aagatatact gaaagaatgg atcataatca aggaagaaga actaaagagc caagagattt 5400
ctcatcagca ataccagaag actatgggga aaatgtcttc aaatctgaag tcaagtattc 5460
ctcctagaag attatactta gccaaattac ctaatgagaa gaagaagaag aatatagcct 5520
cgatattgtt attctctcaa ttttacagat ggaaaaacag gggaaaaaca tttttgtgaa 5580
cttattaaag gtcatcttgc tgataggtgc tgaaaccgaa acttaaatcc ctgtgttcta 5640
tacccctgcc tgcattttta actaccacat ctctcagatt atgaatgcaa tataagtcaa 5700
aacaatgaac attagtgttg ataaattttt taaaggaaat ggagaaaaaa agaggagaag 5760
agaagagaga tactacagta aattatgagt tcattgataa aatcacatta agcagaaaaa 5820
tctactttgt acctatcaca cactttctgg agtattgtgg gtagtgccag attattttaa 5880
agaaaatttt gacacactga agtacttgat gcagacaaac acacacacac acacacacaa 5940
ctgtaactta aaagcaaaat tttttaattt acataagttt actgaggaat tgaatggtta 6000
tcttatttta aaggatgcta tatataatca tgtttgaatt tattatgtat agatgtagta 6060
gacaaaccat aaccaagtag tagaaattat atgaaggaag attttctccc aatacaaaag 6120
ttatctaaca tggtaatgga tagcttatca ctagacatgt ttaagcagaa acttgttagt 6180
gcgttctcag gctctgtagg aggtagatta aatttatagt tttttttatt ttcaaacatt 6240
tcattaagtc tatcagtact aaaataattt agccagctga agtcccatgt ttcaagaaat 6300
agccttagct attaattcat tcagtaaata ctcactgagc actaccgttc attccaccaa 6360
tatttactca gtgcccattg tgtgttaagc agggttctgg aaactagata taatggtaat 6420
aatggtaaat aagacaaagt cattgcccta tagtgttcaa ctagcataaa taagacatat 6480
attatatgct agatgaacag gatacagaaa tcagaattcc tgctcaatga acataaactc 6540
tagtagtttg tgtcaagata gcatattaac ataaacaagt aaacacaata aaatgctgta 6600
attgctactc agaattctgg aggaggtaca gtgggagcaa taaagaagtt gccatcagtt 6660
gcaactctgg gaaaatattg tatgttctca gctgtcatag gcataagttc aggtcagtat 6720
gagggaaaga tatatataca tctagatagg taggtagata gatagataga tagatagata 6780
gatagataga tagatctata tgtaatatat atccttacat ctgaggaaag ggtaggctat 6840
atgttctgtt agggggttga ggataagggt aaggcatatt atactgtagc tgtatatata 6900
tatatacaca cacacacgta tatatgtata tatacacaca tatgtataca tatgtatata 6960
tatacacaca catatgtata catacacaca tgtatacata tgtatatatg tatatagata 7020
cacatgtata catatgtata tatgtatata gatacacatg tatacatatg tatatatgta 7080
tatagataca catgtataca tatgtatata tgtatataga tacacatgta tacatatgta 7140
tatatgtata tagatacaca tgtatacata tgtatatatg tatatagata cacatgtata 7200
catatgtata tatgtatata gatacacatg tatacatatg tatatatgta tatagataca 7260
catgtataca tatgtatata tgtatataca tgtatgtata tatgtatatg tatatacatg 7320
tatgtatatg tgtatatgta tatatgtata tgtatataca tgtatgtata tatacagcta 7380
tatacatgtg tatatatata cagctatata cgtatataca tatatatgta tatatgtata 7440
catatgtgta tatatatata tagagagaga gagagagaga caaagagaga cagagagaga 7500
gagagagaga gagagagaga gacagagata acattaccat cttgtgccaa aggtagagtc 7560
tcacagtttg gtagaatttt ttgtatatag atttcctata tattttaggt gaacctttac 7620
tctggttgtg taagatgtgg cattaaaata gaaacatcct attgaaggct atataattta 7680
gtttaatttt gaaatcctga aatgtagaaa tcttttcctc ccagaaacaa tgtattctga 7740
gatttgacat ttttccttat ttgttagaca aatattaagt gactaaaaag taatagagat 7800
tgtgctatgc aatagagata cagaagtaaa cttaatatag accctgctct ccaacctata 7860
gctagaatat aggatcagaa agtttatgat tataaaatag tgagtttaat atatatttta 7920
atagtcatgt caagatgtga tttgtttctc tagatggctt taagactgct agaatcaagg 7980
gttctagcca ctaggttaat ctttaggcag atttctagaa caatcactaa tggttgtaca 8040
ttaggtacag tatgcttaag tacagggata tcattcacct cttgatcatc ctagatgggt 8100
ataattataa tattaatttt ccagcacgtg taatagagtg tgttgagtaa agaatacctg 8160
ttccagaaaa gagccatatt tactagcact gagtagccct gttggctgga ttgaacttaa 8220
gtattaattc agatgcctgt attttataaa ctaatttgtt accctctgca ggtggcttat 8280
gaatgtatat ttgtctctgg aatttcttag taacaattca tttccaaaat catcagattg 8340
tcatgactag agtagactag actgttttgc tttttctatt cacttatttg tattctctta 8400
gtttttgaag tgaatagtct aagacaatct tttgtcaatt aaaatatttg acactcattt 8460
cagatcccaa ggcttagaaa aatatatatg tttctgcctt cagaggtatg aattccgatt 8520
tctttcttcc acttttaagg tcttttaata tgttttataa tttaataagc tcaatataat 8580
gcctccatgt tgcaatattt ttagcagcct tgaagaatgt atctgaagca tagatatccc 8640
aaagtctcag agagtgtgaa tgtcaccacg atttgtattt tttttctaag tctaacttat 8700
gttctgtcag gcactatgaa aatgaactta aaaatggaga actcaatgat aatttcaata 8760
tattgtctat tcagtgttca ttttttaaat agactataat aaaaattaca ttttctttct 8820
cttactccaa aaagcactca catagaaatg gtcacaaaat aacatttaca atttctgggc 8880
atttaccagc cttctgaaga tttcattgga tgggtggttt gtagttttct acactctgct 8940
ttgaggaaca aagttggcct gctaaacata gcaaataaac aacaaagaca aaaatctaaa 9000
agttggcaag ctatactgtt ttatttgtag tatttattct tttattaaaa atcaatcaaa 9060
aagctacaaa aatttcaaat ctataaccaa aggcattaag aaaaaaatct gttttataat 9120
tgtctttttt atcctctttc taaaaacatg ccaccgtacc aattaatttt ttaaatgtat 9180
tctgatatta ctcttttttc attagctatc ttatgtttaa tcttatattg attctaaata 9240
aataaactat gctctgatat gattggcaat tgtccgactc atttttgaaa gtaacagaga 9300
tatacagtgc taaaagcaat ttaaattgct atttgtttcc tattttagat gtaattttga 9360
agagatctat atcttcaatc ctaccaagga aaccatataa aatagactat aaatgaagcc 9420
acattttact tctttctcct aaatcatttt ttcacatctc aaataaaatc attgatgtat 9480
aatcttattc atgctctggt ctgtagacag aaaattaatt ttactttaca gtttgagttt 9540
tttactcaac tttatttttc tttacttgct ttttgaaaat attttcaatt gttgctgttg 9600
ttgaaacata agaaaaaaat gattttccct aatgccatgg tatatataac aatttactgt 9660
gtaaggtggg aatctaattt tgttttcttc taaaagagct aattagcatt tcaacacaac 9720
ttacggaata attcttgtat ttctaaactg ataattcact tttatcatat attgtatttt 9780
tgggatacat ctatatattt gggatcttat tttgaaagct taattttgtt ttcactattt 9840
attgctacaa tgaggtttta cttacagaaa ctttattttg ttttcttttt attcttgcca 9900
gaaaaagtcc atctaagaat tcccttttta aataatatct tataaattct catgttgatt 9960
gttctactta attttaatgg ttttaacctg ttacacacaa aaggtgagat tctgattaaa 10020
cctgtattaa atgcatcaaa aatttaggaa tcatacatta acaatatttt ataggcaaaa 10080
tatttacttt gtcttttttt ttctgtcctt atgttcctca ataaactctt tagttttctt 10140
cacgtaagac atttaccttc cctgttatag ttccgcttaa tcatttcata tgctatgttg 10200
tcattgtcaa atgagtaagt ttttttcttc tgtttttatt tttgatgaat ttgtattggt 10260
atcagaaagt tagtatgttt tttaaattag ctttatgtta ggttatctac cataattttt 10320
aaattaagga ccacagttat agttttagtt atgttcttga ttcatggtaa ttttatctga 10380
aaatagcaca ataggattta aatttttttt tctgttcact tacttattca ttgattttta 10440
ttaaattggg cacaaactat gtatcagatt tgggagtagg tgctggaggt atcacattac 10500
atacctttgg ctctaaggta tttaaatacc taaatacctt caaattgctt aaagcttaga 10560
gacggtcata aataaataag taattaatta agatacagta tgattaactg catggtagag 10620
ttaaccacag gacagattgg aagcatacag gacacactta cctagtattc taggcttggt 10680
tctcttctgt catttcaagt tttcaaagca tctctctatg tatgttaagt gataggttcc 10740
actgtagttt cctttgtgct cataaaataa tcatagtatt gagtggatta tcaattctcc 10800
tctttaaaac tctggaggtt tagtgcttgc ttcagctgca catataataa aattctgaaa 10860
gtttaaatct ataatgctgt tatttatcac tcttttactt attttgcttt tctatagctt 10920
ttgtacagtg aaaaattatt cagagctaca accctagaaa ggaacttttc tacaaaaggg 10980
tttagtagct caaacatgaa ggcaaaaaag agagaaaatt tatttacttt ttgattttag 11040
gtctaaaatc agtgattttt cctacccttt gaaaagctag ttcttacagt tcttgacata 11100
gagtatattg catcttctac tcagaagata cgtttctttt ctggaacttt tccctgaaat 11160
actgtgttta tgaaaatgtg tttgatggtt gtaaataaat actacctaac ttgtcatgaa 11220
caacagatca cagatagggc ctgtggagac aacatatatt tatttactaa agaatattca 11280
ttgaaaccct gtgtgtgaca gatgctgttc tagaaacttg tgatatgtcc atgaactaga 11340
gataaaacac cttctaccat tttggagttg gcactcagga gggagaaata cacagtaaac 11400
aataagcata aatatattaa attggaactt ataaaattat ctttttgtcc aaatgggcaa 11460
atataattta atgggttaat actatttcat cctttatcat ttatgtatag gattttataa 11520
ttaaattaga gctatggtga tacggttagg cttcatgtcc cccccgccaa atctcatctt 11580
gaattgtaat ccccataatc cccaagtgtc aagagagaga ccaggtgaat gtaattgaat 11640
catgggggca gttaccccta tgctgttctc atgatagtga gttctcagga gatttgatgg 11700
ttttataagg ggctcttccc ccttctctca gcacctttcc ttcttgttgc cttgtgagga 11760
aggtaccttg cttccccttt gccttcacca tgattgtaag tttactgaag ccttcccagt 11820 
tatactgaac tgtgagtcaa ttaaacctct tttctttata aattactcag tcttgggcag 11880 
ttctttatag cagtataaaa atggactaat acatagagaa acaaataagc cgggcaagga 11940 
ggtgtggaat ttggatagaa gagcaaatag tgatcagaca ggcttcaatg attaaataat 12000 
atttgaacaa atactataag tgagtgaagc tttcagtcat gccaagtgag ggttggggag 12060 
ataagaattt gaagtggaag accataatgt gggggccttg tagttacttg tagccgtgta 12120 
gctatttttt tttcaacttt tattttaaat tcaagggatg catgtgcaga ttttttacct 12180 
gtgtatattc caggatgctg agctttgaag taggaataat cccatcagcg atatacgaat 12240 
catagtaccc aatggttagt ttttcaaccc ttacctcata cctccttgcc ctctctaata 12300 
gtccccattg ccttttatgt tcatgagcac ccaatgttag gctcacactt acaaatgaga 12360 
acatggggta tttgcttttc tgttctgcaa aaaacacaat tttgttgttt tttaatggct 12420 
tttttatggc tatgtagtgt ttatgtacct cattttcttt atccaatcca ccactaatgg 12480 
gcacctatca gttgattcta tcactttgct gttgtgaaca gtgctgcatg gaacatgtga 12540 
gtgcatgtgt ctttttggta gaattatttg ttttcttttg gatatatacc cagtaatggg 12600 
attgctgggt tgaatgatag ttctgtttta agttctttga gaaatctcaa aactgctttc 12660 
cacattggca gaactatttt gcattctcat caacagtgta taagcatttc cttttctcca 12720 
cagcctagtc tatatctgtt gctttttggc ttttcaataa tagtcattct gattggtgta 12780 
ggatatcttg ttgtggtttt gatttgcatt tctctgacgt ttagaggtgt ggaatatgtt 12840 
ccctatgttt gctggccacc tctatgtctt cttttgaaaa atgtctgttc atgtattttc 12900 
ctacttttag tggggtttta ttttgttgga ttttcaattg tttaagttcc ctgtggattc 12960 
tagatattag atctttgtca gatgcacagt ttgcaaatat tttctctcat tatgtaagtt 13020 
atctgtttac tccttgacag ttgtttttgc tctgcagaag ctcttcagtt taataaggtc 13080 
tcacatgtcc atttttgttt ttgttgcaat ttcttttgag gactgagcta taattgcttt 13140 
cctaaagctg acatccagaa tggtgtttcc taggttttct tgtaggatta ttacagtgtg 13200 
aagttttaca tttaaacctt taatctatca tgagtcaatt tttgtatgtg gtgcaagtta 13260 
agggtcccat ttcagtcttc ttcgtatggc tagccaacta ttccagcacc atttattgaa 13320 
tagagagtcc ttttctcatt gcttattttt gccaaatttg ttgaagatca gatgtctgta 13380 
ggtgtgcagt tttatttctg ggctctcttc tgttccattt gcctatgtgt ctgtttttgt 13440 
accagtacca tgctgttttg gttactgcag ccttttcata tagtttaaaa taagataatg 13500 
tgatgattct ggctttgttc ttttttcttg gagttgcttt ggctatttgg gctccttttt 13560 
tgttccatat gaattttaga atagtttttc ctagttctat ggaaaactct tttgttagct 13620 
taataggaat agcattgaat gtgtaaattg ctttgtccga tatggccttt ttaataatat 13680 
tgtttattcc aatccatgaa cgtgaaatgt tttgcatttc tttgagtcac ctttgaattg 13740 
ttttagcagt gttttgtagt tctgcttgta gagatctttc agctccttgg ttacatgttt 13800 
tcctaggtat tatttttgtg gctattgtaa atgggattac attcttgatt tgactcctag 13860 
attgaacatt attttcgtat agaaattcta ctgagtttta aacattgatt ttgtgtcctg 13920 
aattacaatg aagtggttta tcagttccag gagccttttg gtggagttct ttgggatttt 13980 
cctggtatag aatcacacac tttctaaaaa gagatagttt gatttcttct ttttctgttc 14040 
gatgcctttt atttctttct cttgccatat tgctctggct agcatttcta gtcttacgtt 14100 
gaataggagt ggtaggagtg gccatccttg tcctgtttca gttatcaagg gtggtgcttc 14160 
caccttttga ccatttagtt tgatgttggc tgtgatttgt catagatgac tcttattatt 14220 
ttgaggtatg ttccttcaag tcttagtttc ttgagggttt ttatcatgaa aggatgttgg 14280 
attatatcta aagctttttc cacgtctatt gagatgatta catgtttttt ttaattatgc 14340 
tcatatgatg aatcacattt attcatttgc atatggtgaa ccaaccttat atcccaggaa 14400 
tagaacctac ttgattgtga tgaattaact ttttgatgta ctgttgggtt cagtttgcct 14460 
tatttatttt gagggttttt gcatctatgt tcatcaagga tatttgcctg tagttttctt 14520 
ttttcattgt ctttggtggg ttttggtatt agggtggtgc tggcttttta gaattagtta 14580 
gagaggactt aaaaggcaca gagtggcagg ttggatagat taacaagacc catccatctg 14640 
ctgtcttcaa gacacctgtc tcagaggtag taatgcccat cagctcaaaa taaagtttag 14700 
ataaagtttt aatgggcaaa cagaaagcaa aaagagcagg gttcactatg ttatatcaga 14760 
taaaatgaca ttaaaccatc aacagtaaaa atggacaaag gagggcattg cataattata 14820 
aagggtgcaa ttcaacaaga agactcagtt atcttaaatt tatgtgcacc caacattgga 14880 
gcacctacat ttaaagatat acatctagaa ctacaaaaac atttagacag tcacccaatc 14940 
atagttgaag atttcaacat cccactgaaa gcattagcaa gattgaagga gaaaagtaaa 15000 
gtaaaaaaga aattctgtaa ctaaatttga cacttgacca attggaacta gcagatatct 15060 
gtagaacact cttcccatca accagagaat atacattttt ctcatctgca catggaacct 15120 
actccaagat caaccacatg ctcagccata aagcacatct caacaaattc aaaaaggtca 15180
aaatcatacc acctatagtc ttggaccaca gtggaataaa aatagatatc aataccaaga 15240 
agatctcttg gaaccataca attatatgga aattaaacaa cttgccccag aatgcctttt 15300 
gggtaaacaa tgaaagtaag ggataaatca aaaaattatt tgaaataaat gaagacagag 15360 
acacaatata ctaaagtctc tgttatgctg caaaagcagt gttaagaaga aagtttatat 15420 
gctaactgtg ggcctgaaaa ggttaaaata atctcaattt aattatgtaa catcgcacct 15480 
ggggaaacta gaaaaataag aacaaactaa ccacaaaact agcagaagaa aagaagtaac 15540 
taatatcaga acagaactga atgtaatgga gacacaaaaa tttgtacaaa gaatcagtga 15600 
aaccttttct aaaatgataa acaagattga tggactgaaa gctagattaa caaagaaaaa 15660 
gagagagaga agatccaaat aagcacaatc agaaatggca atggtgacat tacaactaat 15720 
cccacagaga tataaaagat cctctgatac tattatggat acctttatgc acacaaacta 15780 
gaaaatctgg agcaaatgca taaattcctg gaaacacaca atcttccaag attgaatcag 15840 
taagaaatta aaatcctgaa cagaccatca ctgagtttca aaattgaatc agtattaaaa 15900 
aaaaatacta accaaaaaaa agccctgggc cagatggatt cacagccaaa ttctactagg 15960 
catacaaaga aaacctggta ccaattctac tgaaagtatt ctaaaaactt gaggagacac 16020 
tgtttatcct aagagatatg gaaaaccctc caactaagat ccagggtttt gagcatagat 16080 
atgatatgac aagatctgta tattaaaaac atcactatgg gtgttgtatt gataatagac 16140 
tatgttatca atgcaagggc agaattaaag aaacccaaag gagacttttt aaaataattt 16200 
gagtcttcct gatgaatttt acctgtgata aattcttcct ttgccactta ctaatagtaa 16260 
gaactttgga aagttattta aatctctcag aaactgagtg atcttcactt ataaaattag 16320 
ggtaaaatgc ctaccttata aaatccacat tcatgttttc attcatacaa caaataataa 16380 
ttgagtgcta acttagtgcc catcacttct ctagtcacta gagatgcatc cataaacaaa 16440 
aaaaaaatcc cacgctcatg gagagtggta aggatagtgg taaaggactc cacactcgtg 16500 
gatagtggta aggattaata tgttcagaat cctagcacaa aacttctgtc attagatcat 16560 
tgtgagtgag tgaatttctt tagttttgaa caaattttaa gattcttgag aacagaaagt 16620 
caacattttg ctcctacttg actgatgaat gagtggactc tattaagcaa agcaaatgtg 16680 
aattgaacaa aatcaatttc tagttcaagt ggctgtgcaa aacactactt ctcacacgga 16740 
gatttagtga catataaact gaatgtgtgt gccaaaaatt tactaagcag aattttttca 16800 
tatttggctt gaccctgaca ggccataatg cttgtttcct ggcttaagaa atgtgcatga 16860 
aaagagtatc aattcatgtt cccttctagc taaaaattcc ccaatactat atgtttctta 16920 
gaattttttc attccatact tattaccatc atttatacag aatatttccc aagtatttcc 16980 
aatttactaa tataataatt tttctgtatt tctaatgttc cttctgacaa atatttaaaa 17040 
tgtgcagaat gtgctaggaa acctattgat ttaccttccc ttttttgtca tatgtcctac 17100 
tagcatagat ttttgtctat cacatataca taaacatttt accaaataca taagtaaata 17160 
cttattctat tattgatcct tctttttata gtcgttatgt caccacttgt agaatagtat 17220 
tctgataata gtctaatttt aaaaataatc atatgtataa agtattttaa atgtcaaagc 17280 
tttgagtaga agtattgttc agttggaaag tttgtagaac atgagattgt tgtagagtaa 17340 
caacaagtta acagaatatt tgaagaaaat aaattaaaat aagataggat ttttgatatc 17400 
tccaaaggca atgtcaagag aatcatcatt tctatttaat gattcaaaat tatataattt 17460 
agtatcacca tcataattca catatgagaa aaaaagagag ggagatagga agcaataaga 17520 
aaaagaagga gggtgggaaa tgaggtggga ggaatggtga cagggaggaa gaagaagaga 17580 
aggaaaggaa acaagaaagg aaggaactga gtaacaatga ggatggtgta catgtacatt 17640 
tgtgtgtgta gaatttagaa tgactgcatt aaaatattaa atagtaagat gtataatcat 17700 
gttaaaaata aagcattgtg tacattatta taaggtatta ccttaggaac atatatatta 17760 
ttcaatttat atgatttcat aaaacttttg ctgataaatg gcaaactata atgagtttct 17820 
agttaatgac aaatgaaaag aaaataatat tcaagaggag agaaaagatc aatagacaat 17880 
tgtaaaacca cataatgggt acaaaaaatg gaatgaatga ataaaaccta ctacttcata 17940 
gcacaatagg gtaactgtag tcaacataac ttaattgtat attttaaaat aacttgaaga 18000 
atataattgg attatttgta actcaaagga taaatgcttc aggggatgaa tacaccattc 18060 
tccatgctgt gcttatttca catcgcatgc ctgtatctaa acatctcaca ttccccttaa 18120 
atatatacat ctactttgta cccacaattt ttttaattaa aaaaatgatt aaatgtagtt 18180 
gcgtttggag ctggaaagat gaattaggtg accatgcatg acagattctt cctcactcat 18240 
gattcttttt catatcaagc ttaatgaccc agcaaaaaaa aaatttacag taacctccac 18300 
atgttttatt tcatatacca aaaaaatgac tgaaaatgaa ctgtttctaa agtcaggata 18360 
tgttacaagg aaaggaggtg ttaggtgaga ggagttcata tacctaatta ttgaatttaa 18420 
catattatcc atcctgagct taattcagtt gagcagagta gaaccaagca ttcattgtgt 18480 
tcaagctatg agcgtttatg tacatttgag ttattaacag tgctcttgct gtattatgtt 18540 
ttgaggaagt aaatctgctc atttcctacc aatgcaaaat aaaacaatgg caaaatgtct 18600
ttgcttttct aactctttgt caaagtcttt tgttgtataa gatttttttc attttaaatt 18660 
acaaaattgc acttagacaa accaaacaaa ccaatgtaaa tacataaatg actaattatg 18720 
gggtatcccc tgggacccag tgaaggactg atcaaccaaa catcaggaaa gctggggatg 18780 
gggcagtctt ggtgacagaa attccttctc tttctctagc tatgccaacg ggctgcacta 18840 
caaattagct attgctgcat ttattgatta aaattctcca ctgtcaggaa tgaaagtctg 18900 
gctaaccttt tggggctcag gtgtctaccc tattctaaac aacttggcct gagaaaggac 18960 
ctgggagtat aaacagacct gcttaagcag cctgcctgtt gattgctttc ataccaggtc 19020 
acagtgtagt tcagtaggtg gtcatactca aatgtgccta taaaactaga catttcagga 19080 
aggagagtaa agagctactg gaacctggag aagaaaataa ttctagccat ctaattggtt 19140 
gaaataggac tccttcactt gttttgcttt ttgatgataa aaaacaatca atacaaagga 19200 
aagtgcatac accgactaac ttgtagagaa tatctatcat ctatctatct atctatctat 19260 
ctatctatct atctatctat ctatctatca tctatctacc catccatcca tctatctatc 19320 
taactgctta cctatttatc tatacaggta agtaatttta gaaaattgga ggattatgtg 19380 
tctggagata gtgtgatgtt gatccaaccc atattggaca ttcaaattaa ttatggtggt 19440 
ataaatacag ttacaggtga atcatgttat gtagtgtaat aacaattttt ctgaacttaa 19500 
aaatttttcc ctcaagtaaa taattgtcct tttaaaacat cccctttgtt ttgcatgtac 19560 
tagtaattta aaataatttc caaattttct ccaaatctcc taaatgtaca ggcagccaag 19620 
atatgaagtt atagggctat tcaaagtgct gggactgtct tctaccacat tacaaaaaaa 19680 
cagtagtgtt tatatttgtc tgtatcattc ttccatcagc ctcagcaaac tgacgtttgc 19740 
ttcattttct tttctaagcc agtttggttt tttcgttttc tttttttctt tttcacttta 19800 
attttctttc gcaagtcagt tttagagacc acccacctct gccaatggtg catctgtttt 19860 
ttgatcttgt atcaataatt tgcccaggcc ttaattagca agtatcattc tgccagcttc 19920 
cttctctttg tttgtgcttt ggatttccga agtctccttg cttataaaaa atggagtttg 19980 
ggaatctggt tttctatctc cttgctactt ttggttgatt tctgaaaaag gaaggatagc 20040 
gtgcccactt taccgttttc aagaggaagt tctgaattac cttttataaa tccagtaatg 20100 
tttaatatat attcattacc taagggttta cagaaatata aaccttaaaa gtttgatgag 20160 
gaaactgtgg ctcaaagatt tgtaaggatt tgtacaaatc ccacattacc attaaaagat 20220 
taatactaat acttaaggct tctattttcc atgagctttc ttctgggggc tgctttagtc 20280 
atctttgcat ccccagcatc cagaaatgct ggatgctcag cacacaataa gcacctaata 20340 
gacatttgtc aaatgaagtt taataaaata attgttataa ttgtttaaag agtgttaaaa 20400 
tcccaactag aatgtagttt acataaggca aaataaccaa tacatataaa taggtgtttt 20460 
attttataat aattacagga atgtaatgca atgaaaatta gatactattt catatccaat 20520 
agattggaaa aagggattat gtgactacac gtaataatgt attatggcta tatgtaacgt 20580 
tgtgtcattc aaggatacat aggtatatgc taaaatgtaa taactaacta agattagtta 20640 
atgtaacaca ttagttatta catttacaac aacacggcaa taaattcatc ttaacattta 20700 
taatatctga ggctggagaa gaatgtagag cgttttgttt cttaaactgt gtattggttt 20760 
catagacagt tgatgattac atcattatct actttttaag agtttctaaa atacttcata 20820 
acttcctttt actacaatct ctatgtatgt tttgagtagg agtcatttct ctaagtaact 20880 
gaaatataat gtggtatgat cattgataca tttgaatcat cataatcatt tcacatgtga 20940 
atataataat tatatgaaat ttttggcact aagtagcttg gaggccttaa tgaagtcatt 21000 
tatgtgttta gaactccagt ttccttattt tttaatataa ggagtaggac atgaagatat 21060 
ctaaacgtgc tcttacttca aacactctat gttcctagaa atgcttagtg taacgtatat 21120 
agaactttgg tagagattac atgcagctaa tcataacttg aatttggctt ctgattgcta 21180 
tataaaaata tataaaatat cagttttgat aaggtaccta tactttttgt ttcattctaa 21240 
ccacttatgc tttttctcag ctacctaaat gagcactaat agggatagca aaaccaccca 21300 
gtcaattgtg tttattagta tgaaataata ggagtagcaa aatgacactc aaaattaact 21360 
ctagtcaatg acattcaaat gacaagaata atagtggtag tcatgatttc tgaacatatt 21420 
tatttatttt ttaattataa caggcaaaaa tttggatttc ttgattgagt atatctacct 21480 
attatctgga ttttatatta attcactgat taatttgcta ttaaaatatt gttagcccca 21540 
cttttaaaaa atagggctat gataacctca aatggtaaca atatgaataa ccaaatttta 21600 
tatttatttt aataaacatg gtggaacttt tccctgtgtg gaaataatga ccaactaact 21660 
aacccagaag tttttatgag gaagtaccag tgacaaatat ttattatatt tactatccag 21720 
tgaatttttt ctgtacatat actcagtaca agtagcaatt atcatttttt tccaatgaag 21780 
tatggattcc aatcaatcaa tttcacttgc aattatgttt ttttcacact gtggggaaaa 21840 
tagtctatca ctatttcttc agggtttact cacattcctc aggcacctta ttaagctact 21900 
ttgagtttca tctctggttt cttaacagat taatggaaga agctccctcc cgtgtcagaa 21960 
ccaggaaatt tgcaataacc tttatatgaa ctaaatacag ggtgactaga tcattaatca 22020
tccaaagtag gccattttgt gatgagaagc tgctattaaa aatcattttg gggcaacagg 22080
tgtaaaccaa gattttcctt ggcaaacctc aatgaatgtt taatttacgc atgaaaaata 22140
gtttatatta attgactgac ttgctatagg catatatttt tttctctagt aataaggatg 22200
gattttcttc ataattttat ggcaaagcat gccttgtctt tacttaaatt aacccttggt 22260
tattattttg aaactcttgt ttgttatgta gaagagtaat atagttatta aaggtttttt 22320
ccccacatac agtaatttct tgtcatcaaa gaaatttgta tgatacatta aaatagctga 22380
gtttagtaag acaaaaagaa ataataatca tatactaaat gactcaaaga ttttctttag 22440
taaataagta caggttgcaa ttgaatgtga gaagaagtct ttgagaaaca gatgtcaaga 22500
tggccttaga cttgtaagag agaaatgctt gggaaacata aatgagagga ggagggagtg 22560
gcaagagagg cttcaggtaa caaagatgtt tgaaacgtat gaaagtagag gggggaagaa 22620
gtagtttgga tcaaaagaat ctgagtgcaa tacccttcta agaaagtttt ggccagggag 22680
gtaggaatcc tccaacacaa attgcccttc agaggggctc ctagtctgac aggaatggat 22740
ctgaattaat attcctacta tgttcagtta ttgacagaaa gcagcctgag acagcttggg 22800
tctggtgcag actcagggct catccttagt actggcaggc agctatgttc tcacagcagg 22860
agaaccaagt ggcaaatttt cagggatgcc acaataattt tcttaagaga gtatgatatt 22920
ctacctctct ctaaaatact ctatttagtt attcatttta taattagaaa tttatttcag 22980
agcgttatct ggtatataag aagaagttca ttcttaggta ggcattcaca tcgaattcac 23040
tccttttcat ttatttacat tatacaatgt gtacatgaag atttcaggat actattaaca 23100
tgcaaaaata aattttctat tgtctcacta gtatagaatg ttgagtgtat acattattct 23160
gtttttccat caagtctttt aagctacttt atacatagat gacaattaaa gcaaactgtg 23220
tttctgttat ttttcatgta atattaaatt atgaaaataa tttgatgttt ttacagactc 23280
ttaacaagca atagatgtgt actttttgtt aataaatcat tatttaaaca tactcttatt 23340
gttaaaaaag ttatataatc ataggtaata ccgtgataag catcttaatg tctgatgctt 23400
ttctgtgtca ttggattgaa aatgttttct cctttattag aaaatgagac catctagatc 23460
aagattataa tttcatttgt ctttgtattc ctagcactta gcacattttt agtacacggt 23520
attatctcta tagatggaca aaaacactgg gtcagagatt gagatcacag tgcttttttt 23580
gtcatatagt gtgcgactcc ttgagtttat gattctacct tctctggagt caagggctga 23640
gactacccta gccacacaca atccaaatgt gtctgctaat tacgtaccgt aggttttgtt 23700
gcggtcttgt gaattaggaa attcaaaacc gaaccaccct agattattca ataaaggtat 23760
atgttgtgcc aggaaaactt taaggaagca gaaaatatgt cttcaaatta aatcaacatt 23820
attttaaatt ctgggtctat tgtttaatag ctgtgtgtcc tggaattcta taatctctta 23880
ttttctcatt tttaaggtag gtatattaat ttttaataaa ttctcagagt attattttta 23940
aaaaatcata cctgtttctt ccaaataaat atttatccaa aacatgtaac tgctacaaaa 24000
atctcccagc ttccaaaaaa catcaggaca tgaaacacac tataagtgct atgtttttaa 24060
aattatgaag tactttttaa tcaccaagat aacaccaatg taatattcaa agtaattaat 24120
gaacattctg agcataactt agatagctta gagaaaaggt gtagataaat tacctataat 24180
gaaatttctt ttatgttatt gccgaaactg aggaaagata actttatagt atttttgagg 24240
tcaattttca tattacctct ctactacctg agaaaagggc aatataaact agtatcacca 24300
tgtggatttt agattatcaa aatattgcca ccataaaaaa aaaaacagaa aaagcttttt 24360
tttttttttt ttctaaaatc ttccatttga ttgcaatcat tattcaaggg aaagataaac 24420
tgtttatatg tgcagtggct tgccataccc accagatact tcattctata aaactgaaat 24480
ttagaggtga atgaaaacct aaaatttaag ttgttgccaa tatctcaaca agtctaatta 24540
ggaaataaaa cattccagaa gcacacacca gcagagattc ttaatacttc agtatttttt 24600
tttactttta ttttaagttc aggggtacat gtaaaggttt gttaagtaaa cttgtgtcat 24660
gggggcttgt tttacagctt atttcatctc ccaggtatta agcctagtct ccattagtta 24720
tttttcctga tcctttccct cctcccaccc tccccactct gatatgccgc agtgtgtgtt 24780
gctccccctc catttgtcga tgtgttctca tcatttagct ccaacttata aatgagaaca 24840
tgtgacaact cagtatttct aattcctcag cttatatgta cttatttact tggagaaaat 24900
aataggaaat tatttgaagg aaatcatata acaaaatttc tatactgtac acaagataag 24960
tagagcaaaa taattactgt aatgatgtat gtaagtaaaa atatgttaac taatatgttc 25020
tccaggaatc tgcatagggt tataagtgaa actttgtata ctttaaaata atgcatcctt 25080
agaagtacag gctcaaagta tttttactca tatataaacg tatctcaaat tggaacaata 25140
ttatagtgtg ctcattgcat actgcgggta tcagagcccc agttctacca gagtagcagt 25200
gttacattgg tcaagttaat ccacttctca cagagttttt cttcatttgc aaagtaaggg 25260
aatataataa tagtagaagg accgtgggtt ttagggtctg agaactggcc tctgactact 25320
agaaagcttt cttgaggttg catacaaagc acattctatg tactttagtg atatacttaa 25380
cttgaggggg ccagaattgt tttggcagag attttgctaa gattgtttaa ctaaggtgtt 25440
ggtggtggga gaaactggtg agatttggaa tagtagaact gaggcgtatg ttacttaaca 25500 
tctctcaatt tgtattctca aagtgttaca aataaaattg gaccatcccg aggcaatctg 25560 
ccaagacaca ttaggttatg ctgtggtaac aaactacttc caaatcccag tggcttccaa 25620 
ccctaaatgt ttatgtctca ctcagattac atgtccattg tagttcagct gtggctctgt 25680 
ccaagtcacc ttcactctga gagtgaagta atggagtagc ctgtgagtat ggctgagtat 25740 
cacctggcag aaggacaaga gagatgtaat aaacttcact gacactaaac ccactgcttg 25800 
gtagtgatgc atgtcagtta tataccatta gcccaagtca atcacagggt cagacctgag 25860 
ttcaacagga cagggatgta taatcctcct gctggaagga gtattgcaaa ttacagacaa 25920 
gttaaaagtt agtagggtga gaagcataat tatttcacaa gaagaggaga aacattgtta 25980 
ataatacact gttattaatc cgtggaagat aactattaat ataattatta taacgatgaa 26040 
ataatatagt aaattctttg aaagtaaaat tttaatttat gatgttaaaa tctcataaaa 26100 
attttttatg gcaatataag gatttttact tcacaagcat cttcataagc atcttaacaa 26160 
aaatattttc acgaaaagca aaatagaggg caatgtgaat ctgatgatac aattaagtgc 26220 
cccaaccccc caaccattgt acagatagcc tacaaagcag atgtgatatc caggttttta 26280 
tagctgatgt ttggggaaat ggaactgaaa attcaggatt ttgtttggcc actccacttt 26340 
ctttcacaaa atgagaaggc aagcactaaa aaaataccaa agaggtggca atcaaaagca 26400 
gtaatatttc tctcaagacc agggtagtcc aacataaaat aaacatctcg gactggcgtg 26460 
ttctcttgta ttctcatgct tgatgtattt cttagcaatt caaatcagct tgagagttta 26520 
cggataaaag gagttggggc tgcatttgct aaggtgttta gattcaaagc aaaagaggtc 26580 
tgaatagata ctgtagaaag cattaagtat ttactgtttg agggaaacac atctttgaaa 26640 
tatatgtgaa taatctggaa aattacagat ttacttgatt ttaccaatgt tagtgtttag 26700 
ttttttggac cccttttcac tggagcatga gggatttcct aaaatgggtg tgttgttaag 26760 
tctagaatga ccttactcaa aagaagatgg cgttacttat gggcaataaa gagaagagta 26820 
atttattacc tttgatgaat tgttgatgaa tatctttatt actggatgtt tctatcacaa 26880 
ttcagaagtc agctgaatct tttcactttt aagcacacag gtgtatcaga aagcaatagt 26940 
tggagaggca atgtttgagt ctcagttttg ctgataaata gatttgtgtt ttggctattt 27000 
ttctgtggtc cttggttttc tcagttataa aataaaggaa gagagttatt tgatctcccc 27060 
aatatcactt ctagtaaaat gtgatagtta tgcctcctcc caaatatatg tgcaagatga 27120 
gtcacctatt gataatttga tgattataaa atctttgaca aagttcttgg aaaaagatgg 27180 
ctgggttctt ccgacttaaa agaaatgagc aacagtatgt acaacaaaca tgttttcctt 27240 
tttattcaaa ccttccttta ctcaattttt cttaattctt cttttctaca catccatttt 27300 
ttgtacattt taccagacta attattacat cttttcctgt aattctttac cttttaattt 27360 
ttatttttgg ttgtgatggt ttaaggtcac cacatatggg tgcacagatt ttatgttata 27420 
caaatttagg ggacaccagt catattttgg tggaataatt tttatgaatg ttaatgacag 27480 
ttgtgggagt ggtcaaaatt tactttatta ttttaatatt gtgtatttct tgcacagaat 27540
tctcattttt tgagtattat atcagggcac tttgtatgac aaatgtattg gtcttactaa 27600 
atgtaatgtt agtgtaatac ctttttgtct gccattatta ctttcttact caattgtaac 27660
atacttaaac tatactttta tacatgattt gattgaacct tctaccatcc atttttttct 27720
ttgaattatt tagtgttgat aaggcagatg atgaagatga tgaggattta acggtgaaca 27780
aaacctgggt cttggcccca aaaattcatg aaggagatat cacacaaatt ctgaattcat 27840
tgcttcaagg ctatgacaat aaacttcgtc cagatatagg aggtaagctt gagttaccat 27900
tctgggattt tgatgaccat taaacatatc attactggta ctattttaat cattatttct 27960
tattccttga gtgtctgatg tatgtcagag aatctgctga atttgaagat tattggttag 28020
aatggtcact aatgttttga ggagttcaca gtctgaaata agaaaaaaga aaaaaaatta 28080
atgtaattat tttccaactg ctacaacaat atctgaccga gttttgataa aactgcctaa 28140
gactgagaaa tacttcataa aagagctaat atttgaaaga taagatacag aataatgacc 28200
atttttataa ataatgaata ttttacattc tcatttccaa aattatactt tttcatgcca 28260
tattgcactg ggtaagacta gagtataatg ttgaataatc tcaaattgtt cctgatatct 28320
ggaaaggtaa tctctatttc actagtaggt atgatgcttg cttaaacgtc ttttttaagg 28380
acactttttt tggttacctt ctcttctatt ttgctagggt tctttctttc cctacacatc 28440
tacaagtatg tgtgatgtga aaacacagtc ttcaaaacag gtttctttct ggttagagaa 28500
aaatatcaga aagctagaca gaagtgtgtt taagaaactg agcagtgata gggaaaagaa 28560
ataaaaggga gaattttgtg ctcaaccctg gcatcatata ttagtagtat ttgctatgta 28620
tctatatggg cagtgaattt cttgcattta attaatattt gttttatttt agtattttgg 28680
aagcttcctg tttttgttct agttttacct tcgtactttt atcttaatag ccttagtctt 28740
tcatttatta gccctttact cgggattctc agttgttatt gatattattt aactcccaca 28800
tactgtccat acaacaatta atgatatttt tctattttct ttttttcttc tcctttgctt 28860
ttttattttt aattatatat tttctgcttt gtcacaacat gtaatatttg taaatttgtt 28920 
taacaccatt ttcccaatct ttgctttagt ctttaggctc tttcaatata tgctggaggc 28980 
tacaatcttt aggaaactaa cacaagaaca caaaaccaaa taccacccat tctcacttat 29040 
aagcgggagc taaatgatga gaattcatga acactaagaa aggaataaca gacactgggt 29100 
ggggagtggg aaggaggaga ggagcaaaaa agataactat tgggtaatgg gcttcatatc 29160 
tgggtgagaa aataatcttt acaggaaaac ccctgtgaaa tgaattcact tacgtaacaa 29220 
accttcactt gtaaccccga acctaaaata aaagttaaat actaattaaa aaaattctca 29280 
tcattggtat attttgctaa agttatttta gtcatctctt tattgagtag atttttcaga 29340 
aagttgttgc atgtaaaatt atttttctat tgatttgaaa cttgaaggac atagtggcta 29400 
catagaaaat cctcaattta tgttttcttt cactgacttt tcactttaat agcctaattg 29460 
cactggtaac ttgctttgtg tgttattttt gaaaagtcta atgccagtct attattcagt 29520 
ccttgtaatt tacttagtct ttttttttcc tcctggaagc ctttaggatt ttttaagaaa 29580 
ggtttcaact gaagtctttt cttagtatat gtttcagagc tgttcaattt gggtagccgt 29640 
ttttctagac attgatacat ttttttcaaa atgtaatttc aggttttctt ttcaaccttg 29700 
ggaaaatgca tttacttctg tatttattta tttaggaaca taaagttgaa tatttgttct 29760 
gttacattgt tgtggttttc ttcttcacga attcccacta taggtatgtt agtcacactt 29820 
tgcctgcctt gtatttgcac aatattctct ctgactttct gacttctctt catctcagtt 29880 
gtattctctt cattattttc ctgcctatta ttaaactgtt tttagttgat tcattggtga 29940
ccatcataat tttgtctttt tcttctattt tttaaagttc aaccaatttt ttatttattt 30000 
cttacttttt gtgtatttgt gtgtgcgtga gaaatgtatt tatcttgctt agtttctgaa 30060 
ttattttatt caaggtgatt ttttcttatt cctaaatttt tgtttgaata tacttaactg 30120 
tggggagtat tttcttacag tttcttctgc ttattatttt gtgtgggaat tttccacaat 30180 
tgatttgtat tggatcttcg tatcttaatt ctgcaataat tattttgtcc ttttaatttt 30240 
gtggtttctg ggtgttctac aagatcttag cttattgatg tattttaatg tcacgatagc 30300 
aaagttcaga tactttggat ttgattgtgg tgttggggtg tattatttat tttgtggtcg 30360 
gggagaaatt ctgttacttc caattcattg attcttcttt tccttgtaga agcatgaggt 30420 
actccccttg acttttcctt gtcactatac tatttcctaa cctcacttct gtttttgttt 30480 
ttgacatttt tcttggccca gaaacaacct attttgaaca tcactctttg acatcacaca 30540 
tacttgtaga tccttttcct cttagttctt gtgctggttt ccaaatacat gtttaaatac 30600 
tttcatatta atggtacatt tactttcttc agtggacttt ggcatctacc ttaggcccat 30660 
cactgactcc cttctgtctt ctacagagcc tatttcaggc tgctgtgccc accataaact 30720 
gttgccttag agtagagatg agacaatgtt tgaaagattt ctttccagat ttggtatttg 30780 
gggttttatt tgtttgcttg tttgacattt acagtcattt tgaagtgtgg gcattttttg 30840 
tctttaagtt atactgagaa cttaattttt gtttacttct acttttctct tttttttata 30900 
tacatttttg aaggaaatat tgggagatga agacacaggg aatttatgta aattttctca 30960 
tgtactgctt tagactttta cgatttattt ctttgtaatt tcttctttgt cacatgcagt 31020 
agttagtgaa tattgtttaa tttcaaaaat tagcctgtgt tctaggtata tttttgttat 31080 
tggtatctac attaatttaa tctgggtctg aaaacataaa cttaagtcag atttataatt 31140 
cttgttgctc aaatgtatag atgtataatc ctatacatct actgagagag gtgtgttaaa 31200 
aatctctgac tacaatgatt gacttagcct ttaaaaattt ttagtccagt aagattttgc 31260 
tttatacata ttgaaactgt tttaggtaca cacaaatttt ggattgtttt gtctcctttt 31320 
ttaaaaaaag ttttactttt atttttatgg atacatagtg gatgtatata tttatggggt 31380 
acatgaggta tttttatata cacatagaat gtgtaataat cacatcagga taaatggggt 31440 
agtcacctcg agtatttatc atttcttttt ggtatgaaca ttccagttat actcttttgg 31500 
ttatttttaa atgtatgata tattattatt gactgtagtc accctgctat gctatcaaat 31560 
aatagatatt atttattgta tctgactata tttgcataca cattaaccat ccccactccc 31620 
ctacctcact accctctggt aaccatcatt ctattctaac tctatgagtt caattgtttt 31680 
gattttcaac ccctacaaat gagtgagaac atgtgacttt tgtttttctg tgcctggttt 31740 
atttcactta acataatgtc atcagttcta tccatgttgt tgcaaatgac agaatcttat 31800 
cctttttttt gtggatgaat ggtactcaat tgtatacata taccaaattt tctttgcgat 31860 
ggttacacct tctgttgaat agcagtttta tcattaggaa acaatctttc tggtctgtag 31920 
aatacatttt gctataacat ttaatttgat tgatattata tatctatatg tgttttcttt 31980 
gagttatgat ttgacttttt tttttttttt tttttaagat ggagttttgc tgttgttgcc 32040 
caggctggag tacaatggtg caatcttggc tcaatgcaac ctcctccttc tgggttcaag 32100 
caattctcct gtctcagcct cctgagtagc tgggattata ggcatgcacc accatgtccg 32160 
gctaattttg tatttttagt agagatggag tttctccatg ttggtcaggc tggtctgctg 32220 
gtctcaaact cctgacctca ggtgacccac ccacctcggc cttctaaagg gctgggatta 32280
caggtgtgag ccaccgctcc tggccaattt gattttttaa tatctttttt tcatacttta 32340
gttttcagta tttttgtgtc cttcattaag taattttttt taggaaatat gtatattgtt 32400
cctatttcct aattatcaag tctggcaatt attttctttt aaatggagtg gttaatgcag 32460
atatgtttaa ggtaattatt gaatattttg tgttaaatcg actacctttc tattgttttc 32520
tgttttgtta agctatttaa tttgttctct attgcttata tttttctctg ttctttcctt 32580
ttttgaattc attaaatatt tgcagtttta ttttatcctt ctatcattgc attgtacaca 32640
tattcattta taaattttag tcagtaaaat atgcacttta ttagatttaa tcataaatca 32700
gtaagtttat cataattcaa agaaatttca aacctttttc cataaactac agtgtcattt 32760
tgctattatg tccatatgtt ttattctata tttatctgaa ataccaaaag acatgattat 32820
tattgtttta aggtgtttaa aaacaattaa cttataccaa taaattatca tttctggtgt 32880
taattccttt cttaaatacc ctatttccat ttagaaatat attctctgca gtatttcttt 32940
tggtgcaagt ctctcgaaat ggagtatttc agttttatat gaaaattatt ttgtattgcc 33000
ttcattttga atatttccaa tcgatatgga aaccctgtta tttcctttcc aaacttcata 33060
aatatcactc aattttctcc tgacttttac tatttctgtt gaaaagtcac atttcaacat 33120
taatcgtgct attttgaaga taaaatgtat tttatctctg gacaatttta acattttctt 33180
agtctttgat tttctaccat tttattttag tgtccctaga tttgctttgt tttcatattt 33240
atattgagga gtttacagag cttcttgaat ctgtgggttg aaacatagct gcttttgaaa 33300
ttctcggcct gtatttcatc agacattttt cttctgtctc aatttctctg caactccact 33360
tgcgggtatg tttgacctct gctatgtcat tacatctcat acgctcttct ataattttct 33420
ccctatttgt gtttcaatct gctctttatt accgacctaa tttcccgttc aatgtttctc 33480
actggtagta tttacacctg atttttaaat caatttcagg tatactataa tatttgcaat 33540
cttttcttct attttcttat atttattaac agaagttatt ttaaaatctt tgatcattac 33600
gatgtcttga taacttgagt aactttttaa aattttcttt tccattgttt ctctctgttt 33660
aaggttatgt ggttgtattt catggaatgt gtcttaattt tgtgttgaat gcgagtaggg 33720
tatgtgaaaa ataatggaga ttctacatga tattatttta ttccagagag gacttaattt 33780
tgtttcgata gtcaggttga gtaaagaaaa tcactgtcat ttggtcagaa ctagcgtaat 33840
ttccttggta aagcttttat tttgcctaag accaaccatt tgagatgtct ggactgacag 33900
cctgaggttt gtaccaggat cctctctcct ttgttagtca gacatcccta aaatagctca 33960
actctcttcc tgttgtttta gatgcttagg agctgcttat atttggtttg tcaacatcac 34020
atacgtacaa tttaggaaac agcaaatgcc ttgaggggaa atcccatgta gaatgtccga 34080
ttaaattccg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tatgtgtgtg 34140
tgtgtgtgtg tattccaggg aagtcctagc tgtcctgata gcccttaatt cgaagttttg 34200
tctctccagc caaataaaag atctaaaagc tctaagctgc tctgttctgt tcagtcacta 34260
ttgctgtgct gtcaaacagc aaatatacca aggggaagaa gtggccaaag agtaatgtag 34320
aactcacgtc attttcatta tcttcaggga attgttacat caaattctgg ctatctgatt 34380
tgctctgctt tcttcaccat ctgtgtgtgt ttatttaaaa aaaatttata gttttcttta 34440
ttgggttgga tattttgata ttaattaaac cacaaaggaa gcatcagagt ctgtatttct 34500
gttttactcc ttatacttta ctattattat tttgtgtctg ttttctctat atatagagaa 34560
ctatatatag ttccctaaaa ctataaatca ctatattcct aattgctaaa tcaatgctta 34620
aaatataaac taggttctct taatatttac caagtatttg tagaatagaa tccattgtat 34680
ataggctttt tgtagtgcca tgtgtaggta ttttgatagt tgccctactc catgagtttt 34740
gcattttagg tacatttatt tcagatcaag ctatgaatgc tcaaatttta tatacaattg 34800
tgatagcatc actgtgttta atacattagg ttaaatatgt cctggatacc ttgataactg 34860
tatcatgaat agtagcattt ttttctgtgc atctgactac atcataggtg acttaattgt 34920
tgaagtaatt tttgtatgta tcgctttctg tccaaaatat gttttagtac cctttgtctt 34980
tggtgatgct atgaaaactt ataaaaccaa cttattgagg tcttctgaac ttaatggaga 35040
attaacatag aaaggttttg ttcatccagt taagcttcca ttgctttatg ataaatatta 35100
ttttcctatg tttttatttt tttgaggcat ggtctcatgc tgtcatccag gctggagtgc 35160
agtggtgtga tcatagctca ctgcagcttc aacctcctgg ggcttagctg atccctccac 35220
ctcagcctgc tgagtaattg gtactacaag cgtgccacac cccgcacagc taattgctgt 35280
attttttgta gagacggggt ttcaccatgt tgcccaggct gatctcaaac tcctgagctc 35340
aagcgatctg cctgccccag cctcccaaag tgctgggatt actggcataa gccaccatgc 35400
ctagcctatt ttcctacttc ttaacaactt tacaagtaga ttctatgact ttccaaacag 35460
tttcagttat ttaaatagtt caacctaata taaaaattta atatccagat taaaagacat 35520
cctctcttga gacacaggta agacaagcac cactagggtg acactgtggg ggtaagagga 35580
tggtaagaaa gggagtagtg tggcttgttg ttttattcca tctctttatc tcttctatgt 35640
ttctagcgat ggggacatgg gggagatata aagaagattg gtctacgttt taactgggca 35700
tagttgcaac cttagtcttg ttacttgtta cttctttttg gggtaacatc aagtggcctt 35760 
aggtgtaagt taaaggcttg acgtataaat aatgactttt atggttagga ggtataaata 35820 
attttttgtt gtttttcata ggcattttat atggctcaat tccctggctt gggaaatctg 35880 
gatgctcttc tgtatcttcc acctgcctta actcctttca tcagcaatct agtctgaaga 35940 
tatgtgctgc tgatgcacaa ctctcttgag gtcaaaaaac ctgtgagacg ctgctaacca 36000 
agttattttc ttttggcgtg ctttataccc acaagaaaat aaaatatcct aaaagtgcta 36060 
aagagtaggt gtgcagctct tcctctgctt tttcttcttt aagtcctatc acttctattt 36120 
ttccttttcc ctgatcaggt atgggagcaa gagcaggttg ccaagtctgc tgcctaaaca 36180 
tatatttaag tagagaatga gaaaccagcc ttcaattttc ctgagaagac attcttggga 36240 
atacacacac acacacacac acacacacac acacacacaa acacacacaa acacacacac 36300 
acatgccatt taaatataga tatgtggggt gtgtgtgtgt gtgtgtgtgt gtgtgtattt 36360 
tatagtgggg atatacacta taccatgcct gtattgtaag gtgcgaaaag taatttataa 36420 
tgaaatattt ggcattctat atgcgaatgc tcaggtttaa ctacactgaa agaaattatg 36480 
aagcttccag atgcttgcat ttatacacag aatgactgca tattggaaag cctgcgcaag 36540 
ttaagagata caaatgtgtc ttatcagtga ctccagtact atgagcaaca ttgtgattgg 36600 
tgatatggct atctaaaatg gtgtacaaca ttttgtaaaa ttcagaacaa aataaatacc 36660 
atatttctgt gttttatttt tcaggaattc ctagcaaaca cagaaatact ttgtgtttat 36720 
gtgtaaaaaa acttatcttc taggttcaga ctataaggca tactgtttta cctatatgaa 36780 
tgctagacag gatattcaaa atcttgttaa atgtgggaca atttttcatt acatatgact 36840 
gttttatgaa gttcaagcac ttcatccact aaacactact actaacaatc ccaatttttg 36900 
tgacgaacac aaacagctgc ccagaaattt tcaaaataat ttacaggcga catggtgttc 36960 
tcattgagaa ccactggtga acattgttca ttaaaactta ctgaaatgat agaaatgttg 37020 
ttctgtattg accatttttg gaagcactgg ccacaagtgg ctgttctgta cttgaaatgt 37080 
gattagtgtg gttaagaaat atttttattt catttaaatg caattaattt ctatttaaat 37140 
ttaaacagcc atatgtttct actggctaca gtattagaca gctcagcctt aagaatctac 37200 
ttatccctat gtaccttttc atccacatat atctaaatgt aacctggaaa ggttgattgt 37260 
caagtggtgt ttgacatttc cttgttaatt aaggtctcac aagaagtatc taaactcgag 37320 
cttggtatct atctttgggt caaagacttc tgctgtatag tctagggaaa ctggaacatg 37380 
ttaatcataa ttatttcaaa gtctatatgt ggtatggacc tggagttttc cttcttggga 37440 
catgtttata agctctacca tttaataact ataggttttc tgttaactag atcgatattt 37500 
tatttttctc aattttataa cttgtaacat tattttcatt tttatttact attgaaaaat 37560 
ttatctcctc ctgacttttt cctgattcat ggattatttt ttaaatgtta tttaatttct 37620 
aaatatttag ggatttttaa agatatcttt tttattaatg acttctattt taactttgtt 37680 
gtggtcctag aatatactta tattatttca accttatata agcatgacac aataatgaaa 37740 
accaggaaga taatattgat atattactac cagtgaatac ttggccccac ttcccaattg 37800 
tctcaataac gtattcacag aaggatgatt cagttcagaa ataatattac atttttgttg 37860 
ctttactagt gaatttcacc acatatagaa gaaagaacta ctataagtct tacacaaatt 37920 
ctttaaaaaa aatagagaaa ggatactcaa ctcatttcac taaaccagca aaatcctgat 37980 
aactaaacca gacagggata ttacaagaaa aaaaaattat acgctcatat ctctcatgaa 38040 
tgtatcggtt gaataaaaca gttggtacat ttatgagaaa caattatttg caggtaaaaa 38100 
cagttgactc cctgaataaa ctttcagaaa atcatcatag aggctgctct gcctatggag 38160 
tagccattct tttattcctt tgctttctaa aaacaaaaca aaacaaaaac ttgctttcac 38220 
aataaaagaa aatattcata gaaatttgag taggttgata aaaataaacc tgtatatggg 38280 
caaggaccag aagagagtcc aaaagcaatc agttgtgtaa gagtattcac actaggtggt 38340 
tgcttgcttt cttggaattg tacctatatc atttacaaac aataatatac ttggtgaaat 38400 
gaaatattta aatttgaaat aaaatttata taacataaaa taaagaaaaa aaattatgtg 38460 
tgataactgc aatatctgaa taagtgggac tatttctctt ttttaaggcc tattctgtga 38520 
caataagaaa agtctataaa actctccagg attagagttt ttaattgaac taaattaatt 38580 
ttaatataca aaagtgacca aaaagctaca taatctgcta aagatatctt agtgaaatta 38640 
gaatagtgtg tccaagaaag catggttaaa atcaattata aagaaaagat ttttaaaatc 38700 
tattgtagtt tatattttct cactgatata ccaattaccc acctctacaa ttggctgtcc 38760 
taactattgc ctttgctact tggaatctga atgtggaaac tttattgatc acacagcttt 38820 
tttcttttta aaattttaaa cttttcttcc tatttctgtt ttctggatgt tctttaaatt 38880 
gccctgaacc tctaatccca ctgggagaat atctacttac atcctgttga gcaacttaaa 38940 
gcaaagtact taaaataaca gcacagaaag aaaccttgca aaaataatcc agtgtcatct 39000 
ctacctgcct tcacatttat atatacatat gtgatttgcc atggcagtac atggttttct 39060 
catgcattgt tatgctattg caaaaaagag atctttgatc tccatgtgta ttgatgaatt 39120
aaacatagtc tgtgcttact ttcaagatgt tatttagagt ctggaatgac aattttctgt 39180
tagttttatt tttctagttt taaattaaaa tataatgtct tagtttttga acatcactca 39240
tatcttgcaa ttgaataaac tttggattga gaaaatcata gtgctaagaa agagtactta 39300
gtcaatttcc aaaagcctgt attttaagta atattactca ttagtattgt tcatagtatg 39360
aatcaaagca acaatgagtc attattaaga agcaattaca gatttctggt tttttttctt 39420
aattatattt taaagcatat caccagccgt ggtttcaaaa ttaatatgaa atattttagt 39480
ctacttcctg gtttattcat ctttgtcggt tgctgctgca gattcaagga tcagatcaaa 39540
tagcatctca agtttacaaa atggtgaaca gaataactca aaaagattcg tcatagcttt 39600
cacatacata tttaacaatt gttaactaag tatcttctat attagtagca atattggact 39660
tactgatatg attattcttt tgtctcattc tgggtgcaat tgctctatgc tcatcataaa 39720
acacttcagt atctctgctt ctatgtaact catgctaaaa tgccagagtt atttttctaa 39780
tgtatatgtt tatcttataa ttggctaaga aatgttgatg tcaggaaacc aagacagtta 39840
actcatatct ataacctctg acactgtaaa tgtagaaaac atttatcctc taaagaaaat 39900
tgtcccagga acaatttccc tctcaatgta gacaacgttc agctttgaat atgtgaaata 39960
aaagtaggca ctcggcgttt gctgatttat cagtttaact gaaaaaaaaa atctcttatt 40020
atataagaat ggtatattgt ttataccagt tataactact tataagtggt aataaatgta 40080
cttacagatg gaactaataa atctattgtc tcgttgttaa ttgactttca tgaaaatcat 40140
tctgtaaatc taactataac tggcttaatt ttatatttat tggctacaat tcttagctag 40200
tcagaaacaa aggagaaaca gttaaagcaa ctcatctcag aagtcatcag cctatgttaa 40260
attgtaaaac cttctataaa tcatatgacc tacttttatt taagatatct aattgatggc 40320
tagcaaatgg aattgcattt tcaagtattg tgtaggtcta gaaacatagt tttgtgtact 40380
aaaattaaat ctgtgttatt aaatacgtat gcaatataat aaaatttttg cttgaggaaa 40440
aacaaactat gtaagtgttg tgtgttttat taagtgtatt tttaattgca aggggtaaaa 40500
tattgttcta aaatttgata gacattgttt tacaaaatgc ttcaagtttt tcactgaagt 40560
ttggtagctt gagggctctg tcagattgaa aaatgccaag accaggagac tgagactgaa 40620
tagatattca gcctgccaga cttgtgtgtg tattaaaaga ataaatacat cttgcctggt 40680
caatcatagc attgcgatta tgagtgataa gtactcggtc atataggaaa aacagatgaa 40740
gtttgaaaag gataactgag catgtatgac taaaaacatg ccacaataac agaaactttc 40800
agtctacaag gtcatttttc tgcattatct catgtgcttt atctctatat aattatctgt 40860
gactgatttg caaaccaaaa agattgtttt ttataaaggc tacatttctg catttcataa 40920
taaaagatgt tttatgattg aagtgttaat atataagtag ttattaagat ttctttaaaa 40980
ctaagattta tttaaaacaa tgtctaatca tatctttgac ctttcttttt gttgatattt 41040
tgcagtgagg cccacggtaa ttgaaactga tgtttatgta aacagcattg gaccagttga 41100
tccaattaat atggtaagta agaaatatta aaaataataa ctgccacaga gatctctcgt 41160
gtatacctcc aaaatagaat tcgcatgtga gtaatttttt atagttaatt ctgaatgaat 41220
tcagatcaac atgtacaatt actttgtgcc aggtattggc aactggagct aaactatacc 41280
tgtatctaag aatctgtatc ttgttattcc aatgtgctgt gtttggtaga actatgcaac 41340
ggcataacaa attatcatct gatccaccaa gaagaggtcc ctgaagaaat ctggaaagaa 41400
aaaaaatcag gaagtctgag gggaaagtag tcaatcagag aaaaaaaaga cttactggca 41460
gagaaagcag catgcacaaa gtcacagagg aatgaaacag cctgataaga tcagggggac 41520
ttccagcaat ttgatgttac tggaatataa agtgtaaagt aaggaattga ggcaaaaggg 41580
atatttaagg atcttatcaa ggacagtctt gtaagccatc ataataggaa gcttgtaaat 41640
tgtattatgt ataaattgta tcccttggga gaagataaat gacaggaaac attttaccag 41700
gggagtgtca aatgcagatg tacattttta aatacatctc tcttattgca gtatagaaga 41760
tggaggtaag agtccaagtt agggttacag agacccatta caggttattg tgttagtctg 41820
gacaatagat cttcaagggc ttacctactg cagttgcaat caagctggag atgaaaataa 41880
tacatttttt tcaaaatagg taattttttg agaatttggt gacttattaa ttatgtgtaa 41940
cctgactcca aggtattgaa atgaagagac tttgaagatg gtagagtcat tatcagatgt 42000
aggaatacag tgcaggccta tgtaccccaa atgctcatgg tgcattcagc aggaatggtc 42060
cagcagcatt ttgccttata gatttgctat cagggaagaa gttaggacaa aatatagatg 42120
tagatgagaa agagaaagga gacaagcagt atataataaa gctcaagaca ttaagaatgt 42180
gtgagatgat ctaagagaag attcagatta agaagatcac agatctgagg accagatggt 42240
gtaatgagat gaatgtatgt aaaatatagt tgcaagaaag agaagcacag aaaggggaca 42300
tattcagaaa agtcagaagg gtaggagcaa aaccgagaga agatgatatt ataaaatctg 42360
agaaagtaaa gaggataaag aaggagggaa tgataatgag ttacaaatgc acatgagaat 42420
acaataaaat ggattagata gtgataactg gcattcacaa gtaggtggtg ttgattagct 42480
ttgttaggag aaattcagtt ggaatggtga aggcaggaaa gggataacag ctggattgca 42540
tatccctcct tcatcttcaa aattgatctc atgtccaaac tttttgccat ggatccctga 42600
gaataaatat ggttcctgca gggatagtga tgtagtcatg tgctggtttc tgggtgctag 42660
ggagtgagtg ggaatgtaga tttaatttta taaactacta ttttgagaaa catggcttag 42720
aagaagatga gataagagta tgggagtaga gaaggtagag gagagaagaa gagaaataag 42780
ggagggagag ataatttgag aaaatatcaa gaaaagacca tttagaatga aaactgagaa 42840
agtgtgatat tcatttttta tgttttatac atagaataaa tagtgtttta ttactaatga 42900
agagcacaat taacaatgcc aatatgagca accagcatta ttttaaaaaa gacttcataa 42960
tgttcagttc caaacgctaa taatcccatt tctgtgagtt caaactactg gatcccatgg 43020
aatgcattca ataatgtctt ctactcctct ccttcgcttt tgcctttggt atttgttcac 43080
tagggttgcc acaacgaagt agcacagact ggtggtttaa gcaacaacaa tctatttctt 43140
gcagttatag aggccagaag tccaagatct atgtcattgg cttgcagaca gctgctttct 43200
cactgtgtcc ttacatggtc ttttctctgt gcctgtgcat ccctgatatc tccttttgta 43260
tccaacgttt ctcttcttat cacaagtcat attgtattag ggcccaccct aaaagccgca 43320
tgttgacttt aaatgctctg cctttaaata tagtcacatt ctgaagttct cgggattaag 43380
tcttcaacat atgaatttag ggggttcagc ccataagact tccaaagata attctcaaat 43440
atcttgctga ttttcctctt cattttattt attcaatatt ataatagtca aatcatccct 43500
ttattacagg ctttaccact ttatagctgg tgaaccttta taatatgcct aacagttgcc 43560
tatgtctgct tactcctttg gtgtgacagt cacactcagt ttttctccta ctcaatgtaa 43620
tcttcacatg actctcatta acttttacag ctgtctgcac tgattactgt agtgatagca 43680
ttctagccat gccataagtg acatctttgt cacattagtg caagaaaata tgaactggct 43740
gtggataaat tgaatttgag aggatgaagg aatgttccaa tagactgacc tagaagttag 43800
tcttaaatta ggggctaaga ttcagggaaa agttgaggct ggggccaaat ctttgtcagg 43860
aatttgctaa acctctgtga gatgggggaa aacagcctta aagtgaagag agaagaacaa 43920
agaaatgaat agcagacctt agggaacaca tgtatttaag gacctgaaag atgaagctga 43980
gatagtaaat gaaacatcaa ggttggcaag aaagtgacaa aacaatgacg tggaaaccaa 44040
gggaagagaa agtttcaaca aatcttttta gtattagttc tactcaaaca aattactata 44100
caataagctg atttttgcaa acagaaaact tttaatatac acgtgtaata ggtgcatttt 44160
taacaatatg caaatatcag atttcataaa gtagtacatt gctaaaatgt tatatttctt 44220
ctttagtgga gtattgtaat gtttagtttc taagtaacca atgcctagag aaaagaagcc 44280
tgtttgagca tatgggaatg ttggcatttt catgaattgt atatttttcg tttggggttt 44340
atgttttcaa tttgcatgaa tacattaaat aaaaatatgg tcattaaagt aaagagaaac 44400
aagattcaag gatgtattgg tattctagaa acactttgac aaaaaggcta tcacagactt 44460
ttacaatgat cttcctatat aaaagattaa tggtattaaa gatttagttt ccccagtgga 44520
gcaaattaca cacttgtaac cttcaacagt gattgtttaa aaaaacctaa ttaaataaaa 44580
ttgctattga aactagtgat actaaaacag gtgaaaataa acaaataaaa gaaccccatg 44640
gaacacttga taatgtttat taaaaagaaa agcattgtga tactagggat gaataagaaa 44700
taggttaagt agaagactat tttaactaaa caatacaggg aaattagaat tcacttttca 44760
taaaatgagc aaaattgctt tgaatgtgag tgtaatcaag gaagcagatt tgagcatatt 44820
taagcaagag ggtgccaccc ccttaagtac tcatataatt taagtgacca gtaagaaaca 44880
aaattaatat attagttttt tttaagttag ccaaaatttg ttttgctgca aaatgggtgt 44940
ttctttttaa acttggatac ataaaatgca tttccataga tttctgaata atgcctaaaa 45000
cgacattatt ctacaatgga atatgctttt ctttttacag tcaaaatgag taacagatgc 45060
actaatgaag actggggaaa aatagagtaa cttggaaaga agctgttcaa ttttatatat 45120
ggtttgatta taactatctt caatataagc agaagaaaaa tatgcaatca gaattgaagt 45180
caagaacatg ttactagagt ttatgagaca aaatggttat aacctcgtaa cttttcagat 45240
ttgggggata ttatgtatct ttactcttct gagtaaatat ctcacggtgt tgggtaccct 45300
ctgactggca agtacaacac tcaagggcca tagcaattct ttaaagccaa gtcaaaattg 45360
gtaaaagctc caggaaattc aagatgatat tatgggtatc taacaggagg ctacagtata 45420
agattagtaa gtacaaaagt agctaccact aaaaattcca ggtataattt ggacatgaat 45480
ttaattgtag gaactggtat acccttatat aatccaatat taagagatga ctaatataga 45540
aactttgaaa gtgagggaca aggaacaact aataaaaact acttagatcc cagttcgcat 45600
ctgcagaact taatttaaat attttccaat cgcaaaagtg atatatcaca cagtcctaga 45660
atatacttct gaaactagat agaatcttcc tgctggaaat tgaaactggg tagaatcttc 45720
ctgctagaaa ttctgattcc tcttagtaac ggagacaggg ctgaggaaca agggcaccag 45780
ttacgaagac gacaatttaa aaaggagctg ggaagtcacc tgggctatac ttcaaaataa 45840
gtgcatgctg gagaggacta caaaggggtg agcagtaggc aaagaggatc aggctaactt 45900
agcaaatgac agaatgcttt cattctggcc tgtgcagaaa tatgtaacag gaatgcccat 45960
tttcagaatc ttttttaaca gaaaacgaaa atcgaattgc tttctacaaa aggaaaaaaa 46020
aaaaacaaaa ctttttaaac aaaaatgtat gtggagctac agaagtcaca aacattattt 46080
attatagaat gtaaatgagg gatttttctc aataatattg tttttccttt atttaagtga 46140
aatggccaag tatccaacct taatgtagat aggctatgat tactaaccca taaaaaacta 46200
gaaaaaattg taaccaattc aagactgtct accttgaggt atctatcttg aagtgtcaaa 46260
aaattcaaaa aattttttaa tttgttttat aaaaattcaa aatattcact ttcaatattt 46320
gaccaaaatc tatggtcaaa tagctttaag aggcagttca aaatggtgag aactgacttt 46380
taagtaagcc tctttaactt ttcaaactca ttttgataat aaactctcaa gacatataac 46440
ttgtatttaa gcattctaaa catatggaga aaactggtgc tcatagagat attctttcca 46500
tataatttat tccattgcaa gaaacacatg tttctagggt ggttcatgca attggttccg 46560
gcattccata ggctgatctc tttctcaccc tctctttttc tccctgccac cctccctttc 46620
tctgtccctc tctcccttgt atgtttctgg gcttctagtt ttaggagata aaacctttag 46680
ctatagtttc tacacatcaa agaaaccctt tttattcatt ttactatatt aactggaaag 46740
aggcaattct tgccaagtgc attagcttga agaattttga aatggcaaat gcttggaaat 46800
aagattttgt ttgctattta gataactgac ctgcatatgc caatggtggg gaagaaaaag 46860
gtattagtct tcttagatgt gattaattgc agtactttag gaaacacatt tgtattttta 46920
aaatctatta taagtttatt ttcttggtgt ttgcagtatt ttaaagcttg aaactatagt 46980
tttaatttct tttcttctgc atatattaac tttttgtcac tacataaaat atatttttta 47040
acagatctta ttttaaaata ataaaaaatt tagctacttt tgattcaaca tttatagaga 47100
aaaaattaaa tattagaaat aaataaaagt cattttcatc gctgtttctt ttatccagtg 47160
tgttcattga ctctataaat atgaagtgaa agttcctagt attaggaaat tagattaacc 47220
ttagaatctc taaaaattat caaaataatt ttatttctta attcaaatac acattttata 47280
tgtatatata tatttcattg ttttatttgt ctgtcatata tttatttacc atttattata 47340
gtgacaattg gttcattgtt aaaaaaagat atcagaaaag atatgtcatc ctttcttcag 47400
aagacataaa acctagcagg gaagatgact aaaaatgcaa attagactga gagagaaggg 47460
aaaaggcaaa aaaaattctg tgtgtgaagt tatacctatt ttaaattgta aatggtgatc 47520
ctaatgctat tttcttctaa agatgttctt cataaaaata gttttttatt ataaaagaaa 47580
tacatactca accaagaaaa ttagaaattt aatttagata gaagaaaaaa tttaaaatat 47640
cctgaaatat tatatctagt ggtcaccatt gttatgatca taactttatg tacatagttt 47700
tacccaatgc tttattaacc tcactgtttt atagtattat gtttttctat gtgttatact 47760
atgtcattac tttttttaaa aatattttat tgctattatt atactttaag ttttagggta 47820
catgtgcaca acgtgcaggt ttgttacata tgtatacatg tgccatgttg gtgtgctgca 47880
cccattaact tgtcatttag cattaggtat atctcctaat gctatccctc ccccctcccc 47940
ctaccccaca acagtcccca gtgtgtgatg ttccccttcc tgtgtccatg tgttctcatt 48000
gtaaataata tgcaacattt aacggatttt ttaaaagtaa tgatatacta tgtcattact 48060
tttaagaaat ccgttaaatg ttgcatatta tttagtaatg aattatttta catgtggaat 48120
agttccaatt tgctgtaatt aagagtaata ttattattaa atattttagc caccattatt 48180
atttcttcag gctatatata tacacaccct agacctctca gattttggtc aaatattgaa 48240
agtgaatatt ttgaatgcct agaaaacaaa ttcaaaaatt ttttgaattt tttgaaactt 48300
caagatagat acctcaaggc aaacagtctt gaattggtta caattttttt agttttttat 48360
gggttagtaa tcatagaatg ttgtagaatg tactgcttgg taattctggc ctcagggcag 48420
gtttgtgggc aatgttctga cagaatgaat gtctagtttg cttcagctca tgactatcat 48480
aaatgtactt aaaacttggt ttttggcctt gctttaactg gctgctccaa gtcactattc 48540
caaatctcag atgttatagc tgattcaaac atactcttat gattcctaat tttttacacc 48600
tttatttacc catttattta tcagaaaaga gtgactttgc taataataaa aacacacatc 48660
caaccataca tatctaagaa aaagtaattt agcatatata tatatatata tatatatata 48720
tatatatata tatatgaaca tgacctaaga taaatttagg tagttcatat tttgtcatgt 48780
taagagatta cattctaaaa ctgtgtagag atacatatgt tagttgaatg ttaaaaacca 48840
aaatttcctt aaatgtacat atgcatacac gcacaaacac acatattttc atatagattt 48900
tagtataaag agctactgtt tagaaaaatg gtagcctcca tgattcagat tgaagagtcc 48960
tagacaattt ttctcatctg tagatttttt aatttatagg attgtatggg gtacaagtat 49020
aattttgtca catggatagg ttgtgctgtg gtgaagtcag gcttttaggg tagccatcat 49080
cacaataatg tacatttacc cgttaaggaa tttattatct tcctcaccac tctgccactc 49140
caccctcctg aatctccatt gtctctcatt ccacacacta catccatgtg tacagagttt 49200
tacgttccca cttataaatg gaaatatttg tatttatctt tctttatctg agttgtttca 49260
ctcaatggcc tccagttcca tccatgttgc tgcaaaagat attatataaa ctccaaaatg 49320
cacttaaatg tcctggaatt ctgcaattag atagaatttt ctttatatag caaggacctt 49380
tctttctttc ttcttttatt tgatttttta aattttttgt agatatttac accctggtaa 49440 
agaaatggaa tataaagcca tagtttttca aaccttcata accccaaatg atagacagga 49500 
aaaaggcaat tttgtgaaaa agaaaaaacc aagttctcag gccaggggat tcggatacag 49560 
taatctcaga tggagtcctg gaattagaat tttgaaaatg ctggctgggc acagtggctc 49620 
atgcctgtat tcctagcact tcaggaggct gaggcggaag gatcacctga gctcaggagt 49680 
tcaaaaccag ctcgggcaat acagtgagct cttgcctcta attttttaaa ataaaattct 49740 
aagtaaaatt ttaaaataaa atgttcaaag taaaaaagtt aaaaattaac aaaaagaatt 49800 
ttgaaagtgc ttatcagaga attctagtca accaggtttg gagattacta gtctaggcca 49860 
ttttcaccta taaatttagt agggattagg taggattaaa ggaaagtaaa caatggtact 49920
tacggactga aagtatttgt atgaggaaga gtcattgaaa gaaaccgtgt tcatcttagg 49980
ctttatccct ccttataggt aggacttttt aaaatatata tatattaatt tctaatattt 50040 
tataatgttt tcatgatcat agagtcttag agctgggaaa tataataatt atgatcttgc 50100 
cttttgccca catttcgcca ataagagtgt ataagtgact cagtcagcac cctactctag 50160 
taagtctcag aatcagaatt atgacaaaat tttcccattc catccaatca gtgttctttg 50220 
tactacacat tactcttact gtaatgcata ctttggatat attcagtatt tgttctattg 50280 
attatctgga aagtatttaa aattatgtac gtaattaaaa gacaatacct tattccttat 50340 
tcaatgtgtt atttttatat ctagcaaagg cttaaaacaa tgtagtaaag gtttgtctta 50400 
gattggcaat gaatttgaac taacaccagc tgctcttcag tttaattaaa atgtgaagca 50460
gagaaaactt atttaaatat cttgacttcg gacattttgt tataataaaa attgtaacaa 50520 
attaccataa ctgaaattgt taaaactgta tgataagcct tggatttcct ggtaggggta 50580 
cctagctttg tgacatgtgc aataggattc aaactcctca ggtaaacaga agcagcaagt 50640 
gcttccgaca gcagtgcaaa taatgcccca ctagcaggat acaatgttgg aaagtttttc 50700 
attttctgca gctggactgt ggacaacagt aagcatttta taaactggta ataatataaa 50760 
taatttaaaa catgagtcac gtggactatt tctaaatata tcatttccaa atatagtgtt 50820 
ctctcttgct cagatggaca gactaattca gacatcaaga aaatattttt agtacagggc 50880 
agtcacagat gtagaaacac ttttcccatt cagatcatac agcaattaat ctagctgttt 50940 
agcactgaac caatctataa caaatgttgc acatgtggat agaaagcaag tttgcctatt 51000 
ggctatgaag cttttctagg aagcagaacc ttcatctgaa tatcaatcat ttgtcacaga 51060 
gaacttctta tcatgctgtt aattggtaat aaactgtaat ggggaagtat agtttgagtg 51120 
tgtttataca atagtccctc cttatctgtg gttttgcttt ctgtggtttc agttacccat 51180 
ggtcaacaat gatccaaaag taggtgaaca cagtacagta agatattttg aaaagaaggt 51240 
agagatcaca ttcatagaac tttcagtaca gtattttgtt attaaaatta tattgtatca 51300 
ttcattgttg tcagtctctt attgtgcgta atttatcagt tgaacattat cataggtact 51360 
tatgtatagg aaaaaacatg gtatatatag ggttcccaac tattcaagat tgcaggcatt 51420 
caatggggat cttggaatgt atccccatga ataaagggga tcttttgtat atttgttacc 51480 
atactatgtt tcatacttag gaagcgtgct ccatgcaagt tagacttatt caatcattta 51540 
ttcatccaaa taacccacag tgataacata gatacaaatg ccagaaagta aaaaacaaga 51600 
taaaagaaag gtaagtatga aataagtgac caaaactact aattgtcaaa atataaattt 51660 
catttctctg gtttaaaaat ttttgcctca tatggatgac ctgagttacg gtaaagagtt 51720 
taaaattgtg gagtacatga tcagaatctc ttacacgttg tatgatcttg gaaaggcttc 51780 
cttgtatctc aattaaaatg agggaaaagc aggaaaaaat gagaagatag agcttggcaa 51840 
tatagaatag tcactgggca aaaacaattt gatggaatct ggaaagcaaa tgaaaatatt 51900 
gtgaagggaa ttacattagc tttccctgct taaatgaaag gctatcaaat atatatttat 51960 
ttaaatttaa taaaggttta tgaaaacatt tctaatttat caaaaatagt aggcatgttt 52020 
ttggataatt attttttaaa ttattttatg gcatatgaca actcttgaca taaactaact 52080 
tcatgaaatt ctttactctt ttgccagatt gctagatttt aaataaaaat tatattatca 52140 
taaatatatc taaaatctgt caaatattag tatttttact aagaactgct ccgttttaaa 52200 
tcagtattga atctcagcag aggcaaagta agtttaaaat aggtttaaaa aacaggtaaa 52260 
cacacttcta ttatatctca cttaattaaa aaattcatcg cacatgcgaa gctagcatca 52320 
caatggtttt tctgccctta ttttcttgaa acatcactca atactcttat gttttcagtg 52380 
tagatcactt cttttcagat ttcatggtca ttcagatcaa gttcaaagtg cagagaaata 52440 
aacagagaca ataatctata ggtaatggat ttttaatata ttacatccac tcaattgaga 52500 
aatatgaagc aaacttcaga gagaccaaga gatggaacaa aatgacaaga atacagattt 52560 
tcatagagaa catgagatgg aggaatatgt tccttcttgc cacagtataa attgtccctt 52620 
gtctggaaaa gaaacacatg gataattttt ataaaagcgt tctcaatcca tgggtgtcta 52680 
ttcctactgt taatagctct gtgatggtta attttacatg tcaacttagc caggctatgg 52740 
tgcccaaata cagttatgta tcacttaagg acagagacac attataagaa atgtatcttt 52800
agtcaatttc accattgtgg aaacatcaca gaatgtattt acacaaacct acatggtata 52860 
gtctactaca tactgtggct gtatcatata gcccaatgct tctaggatac aaacctgtaa 52920
agaatattat taaactgaat gtcgcaggta attgtaacac aataagcatt tgcatatcta 52980 
aacataccta aacatagaaa aatacagtaa aaatatggta taaaagataa aaatggtata 53040 
cctgcataag aggcttatta ctgaaggctt gcagacaggg aagttgctct gggtgagtca 53100 
gtgagtgagt ggtgagtgaa tgtgaaggcc tggacattac tttgcactac tttagacctt 53160 
ataaacactg tgcacttaga atacactaaa ttaatttttt aatttctttc ttcaataaca 53220 
aattattatc agcttactga attttttatt ttataaattt attataagtt tgacattttc 53280 
ataacagctt aaaaacattt tacaactgta caaagtattt tctttgtgtc attattttgt 53340 
aagcttttgc ttttttcaat ttttttttta gctttttcaa cattttgtta aaaactaaaa 53400 
atcctacaca ttagtctaca tggtcagaat catcatcatt actgtcctcc acctccacat 53460 
cttgtcccac tagaaggtct tcaggggcaa taacatgcat aaagctgtca tctgatctga 53520 
taacaatacg tcgtctggaa tacctactga aggacttgcc tgaggatgtt ttacagttaa 53580 
catatataat ggtacatata catattgtat acaaatacat aaatacatca tatacattac 53640 
attataatat atttatatat tatatatata acatatgtgt gtgtattttc atttatgtat 53700 
ttattatgct atactttctg tcattttaca gttaacatat atatgtgtta tgtatatata 53760 
taatacagtt aacatatata tttgttgtgt ctgtgtgtgt ataaagtaga aagaggataa 53820 
tctgaaataa tgatagaaat tattgtatgg taaatatgta aaccagtatt aagttagtta 53880 
tttattacca ttaccattac caaatactat gcactgtgca taattgtgtg tattatactt 53940 
ttaaacaact ggcagcataa taggtttgtt tatatcagca tcaccagaaa gatgtaagta 54000
atctgttgtg ctaggacgtt agaatggcca caacatcact aggtgatggg aatttttcag 54060
ctgcattgta attttatggt accaacgttg tacatcattg actgaaaatg tcattaggca 54120 
gtgcatgact gtatttggct aaattccagt tcagatgttg ctgtgaaggt attatttaga 54180 
tgtgattcac attgaaatca gtagtttttg agtaaacagt ttaccctcta taatgtagtt 54240 
gggcctcaac cagtcagatg agcactttaa agattgactg aggttcctct aggaagaaga 54300
gactcaggct gcagactgcc tttggcttca agactgaatc ataaactctt ccctggctcc 54360 
ctaactacca gtcttccctg cacattttgc actttccagc ccctacaata tcacatgatt 54420 
caactctctt ttccctttct ctctttcatc tatctatccc tctacctatc agtcagcatc 54480 
tatatgccca cctatttatc ccgttggctg tgtttctgtg gagtaccttt actaatacaa 54540 
gctcacaaga atagcttgct ccaaacccag aattcaggta ctcaggtact atatatattc 54600
caaggtctgg aagctgactt tgtcctattt tattgtaact tcttaagaac tgggagttca 54660 
ttcatagggt ttataaaacc ctacagattg cattatttca tggtctattt tatatagttt 54720 
aactgttctg tgaatataat gtttgatatt ttactctttt gatttttata cagaatatta 54780 
tataatatat attctggaaa tatagttata tatttccagt aggaaatgtt gattgagcat 54840
gatgcagaaa attaacacat gaagtgctag gattttcagt tcattattct cagtgcagtg 54900
ttcagagctg tttctaggct gcagggtcaa tcattggaaa ttggtaattt tagataatag 54960 
ttgctaaaaa gctcttccag atgtaggata ccctgtgtat ttgaggaatg tacattccct 55020 
gacatacaga agttgtctac tttttcatta tatatggctt gcattcttta ttttccatcc 55080 
tgccctgccc tttactgcct ctttgtcctc tcccttccct ccaccattta ctaattggtg 55140 
cataaatcaa aatatttcca aatcttacag gtttcaggtt aaatgcaaca tctgtcgtga 55200 
aatctttctt aagttcccct aactaaaaat aatctctact atttctattt ctctttatac 55260 
cccttctaca ttttattaat ttccaatttt atgatgtata tgttcatacc aagaaggttc 55320 
cattttcgtc tttgtaaacc aagtacatta taaaactaac aagaacactg aaaaatgtct 55380 
gctaaaaatt taactgatga atgcacacga tgctatgtgc ctttggcaat ctacattcat 55440 
agatataaaa agagagcaca aagggttagt agctgaatcc agaatatgtg tctacttttt 55500 
gaatctctct cctgtattgc tgatgacttc aatgatttaa gcaattggca taacctcttt 55560 
gttccctatt tctttttcta tttcaaacat agaataacac caagttttgt atttacctct 55620 
tcctgctaga attcagaagc tggtgctttt acaataatta agcaaactgg agctgatacc 55680 
tataaaattt taccactaaa ggtatttaat tgtcctatgg attagctgta cttgtacttt 55740 
ttctctacat gtctatagct ccattactgc taaaaattat taggtccacc gttctcttct 55800 
ggccatgtga ctagacagtt ttggaagaga ttaatgccaa cataagaata ataaatcagg 55860
atcaatcaaa ttttatgatg aaattatgca cctaggttaa ttgctttggc aggtcttgaa 55920 
ttttttgtga agtttctccc attactagaa agacagtaat attgaattaa ctggtaattt 55980 
gaaatgtgag atacagtgct taattgcaaa ctgacaatgt agtaatgaga ctgcactaca 56040 
cacttatcaa ttattatttg attttattaa gttaggaatc aaaatggctc acttcacatt 56100 
tattcactaa tgctgttttg tcagctaact tgcaaggaga tacagcacat tttgtaaagt 56160 
acattggatg tcaaatttta aatgcactgt tccttatctc aaaatgttta tcttagaaac 56220
tcttgtattt tcgttctgag taaacagaag ttccatttta tgtttttctt ttgcttattt 56280
cttgtgtaga tcattataac ctgtgaacag gaacatgaag tacaagagca gtttggtcgt 56340
ttgctggaaa ttacactaat acaaaactgg attttttttt attacgtgta acatggcata 56400
gattttaatg gtcttcttat ttcagtgagg catttattaa aaattagcaa tttaaaaggg 56460
cttataagta ttagcttctg caaagggctt gggatggatg gattatcaag acatatgctc 56520
tgagtctttc tcttcttgat attagattga ggggacaact gtgagtagaa aactttttcc 56580
atttctgatc ctttggaaat cataagtctt ccaacagtga ttaacacttc cagaatcagg 56640
gattttaaag aaggtcctgc agctgctaag cagtattgac agctgctttg gaaatcccag 56700
ccttagacaa attggttctc aagtgccaca tttaatgccc tgtgaattag gaatactgca 56760
gctaagtata actgccaagt agcattacat tccctctacc aaccagtgtt gaatatttgc 56820
ctagccgcaa tccttactgg tatcttaagt agtttctgct gtttgatcct atgcaacagt 56880
acaacataat tctggcaaat aaagttgact aaatgctctt tccagggcat ttagtttacc 56940
atagtagagt ttgagccttc tgtctatttg tatttgccat taaaattaga catcatgtca 57000
ttagcgtagg ttacatggtt agactgaatg gttagtcatt tgtgtcaggt gaagaggctg 57060
cttggctaaa ttataattga ttccaaagac aaaacatctg gcctgagaca caggcaggca 57120
gagtggggat taatgataac ttcagtgaag ttgtccaaga atttttcaca aggttcaaag 57180
aataggtaca ccagtgatgt tgtgtataag ggtctcttgg acctcctaaa tgcatgcaca 57240
atttatgtgt tactccatta atattactca ataagatgat taaacatcta cagcgtaaat 57300
acgaattact atgtattatg ggatgcggtt tgccttatga ctcagcctgt gctttaaaca 57360
cagtatataa gcaagggtgt ggctatcaga tttataaggc ttacatggtt gcaaaagtac 57420
tttcagtcca tattccacag cacagtttga tctgtgccat cccatattaa gttgcaaaaa 57480
ataaaatttt aattatctcc ctaagttact ctagcacagc aatcaaaatt atgtaaagag 57540
agggtcatga tgcaagcatt tttgtgcaca aaaacttcat attttaaatc cattcaaaac 57600
aatgactgac ataatataga tgttaaatat ataggtagac tattgaatga ttacacatta 57660
cttcattata tgaataaatt attcagaatt ataagtgtat atattttaag ccctttataa 57720
aacatacatc tcatactcaa ttacatagtt tctggtacct attggatttt aaataattat 57780
tatgtatatg cttttccaga attgtaacac tgttcatcaa aaatagataa aatgaggtag 57840
cttctcaatt tatttttata ttccagttaa gcaaattgta attataagta gtatgcctgg 57900
aacctaggat ttgatccaac ttccctataa ataagttgaa tctagtttgt ttacttatga 57960
accacttgta attttttttt ccatttttca tttattggcc acttccccaa ttgtacttcc 58020
atttcttata ttaaatattt tccttcagga aaattggcca attgctaata cgacatcaag 58080
tgcaagtaat caatctctcc caccagaagt aacttatata atacctattt ttcaatttac 58140
ttattctttt aaatatcaaa agtatataca tacacatgca aacatacata cacacacata 58200
tatacacaca catatataca cgcacacata aatatatata gtacatatat ataatatata 58260
tgtatattta tttcctttcc tctatggcat tgattactgc tgcattcttg ttcattttat 58320
atatgttaaa tgcctgtaat atggcagaca ttttgtcagg atacaaaggt agataatatc 58380
taatcatcta ctcttgaagg ttctctgtta attgtggaat ttcctaatta cagttactat 58440
ggacaaaggc tttgctctaa ccttggaacc catcagactt cactataacc accattatat 58500
gcatttgtta tggatgtgga cttgggattt agcctgcagt ggctcagttt ccatatcaac 58560
tgcttattag ttctgtgatg tgagaagaat tattttatat ttctattaat agatgttact 58620
gaaaaataat gattatagta gtacatattt cacagagttg caataaagat ttagtgagat 58680
gaacatgcaa agctcttaac atatgtttgg cattttataa ttgctcagta agttagtttg 58740
tggttgcttt aaaacattta aaaacagtag atataaataa ttggaattaa ctttttagac 58800
aaaaaacaat gtaagggctc ttaaaaaatt tggttgtttt ttgcgtctag tttctttttt 58860
gtttatgtga cttaaactgt atgtataaat taatataatt gttgacatta agataacttg 58920
cttttaagtt taatgagtac atcctacata ctacaaaatc ttgagtaaaa ttcacacaca 58980
aaaaataaaa aataataaag ccaactgtgc aacaataact acttgaaatc ttggatttca 59040
ttttacatga ctcatgcaaa cagaaagaaa gtttaggccg tgtgcggtgg ctcacgcctg 59100
taatcccagc actttgggag gatgagccgg gcagatcacg aggtcaggag atcgagacca 59160
tcctggctaa cacggtgaaa ccccctctct actaaaaata caaaaaatta gccaggcgtg 59220
gtggcgggtg cctgttgtcc cagctcctcc ggaggctgag gcaggagaat ggcgtgaacc 59280
cgggaggcgg agcttgcggt gagccgagat tgcatcactg cactccggcc tgggagacag 59340
agcgagactc tgtctcaaaa aaaaagttta aaataatggc atttttttcc tccgaaaaga 59400
ctgagtttaa ctgttacttg cagaaacttt acaatacaaa aattcaaact ttcgagaaaa 59460
acaactaaaa tagtagcttt actaatatgt tgttactctt cttatatttt aggaatatac 59520
aatagatata atttttgccc aaacctggtt tgacagtcgt ttaaaattca atagtaccat 59580
gaaagtgctt atgcttaaca gtaatatggt tggaaaaatt tggattcctg acactttctt 59640
cagaaactca agaaaatctg atgctcactg gataacaact cctaatcgtc tgcttcgaat 59700
ttggaatgat ggacgagttc tgtatactct aaggtaatat tggtttaaaa atctgtagtt 59760
gctctcattt attttccata ctaatattta atcaattctt aattttttat ttctcaaagt 59820
tgcaaaacca aaaatgtgga gctcctttca catgctaaaa tataaaatgg tgaaatactt 59880
tggcaatact tttttaagct cagttattat tatgtgaaaa ggactaaaag caaagagaaa 59940
aaattctttg attttcttcc tttgaagcag aacaggttgc agcataaaga ttatggtttt 60000
cttaacctgc agtgcaaata attgagtaag atttcctcta gaaacccacc ctgctataaa 60060
attttataat actctaattt tcaatggtgt ggtaaaacat ccctaagaaa ggattaaatg 60120
aaggttgatg tataaggttc aagtttattt aactcattga cttaaacata tctgcttatt 60180
aagaaccatc cttcatggaa aagatattca taataaacca agagaaaaaa ttaccatact 60240
taaccattta taatctgtta atctgctatg tttgcatatg tgtgtttgct gacatattga 60300
agtatgaatt attatgtgta aacacagaga tcatattgaa atgtatacag aggtaaccta 60360
gttgactgaa gaggaaattg ttagaattct gtcatatata aatattataa tattattgct 60420
atttaagact ttgagacttt agaatcatta ttatattcta ataggtcaca atctattatc 60480
atatgacaaa atcttgtttt aaaatgagac aggagatttt taatgcttca aattatttat 60540
ctttatttaa tacttttctt tccatctaaa tagattgaca attaatgcag aatgttatct 60600
tcagcttcat aactttccca tggatgaaca ttcctgtcca ctggaatttt caagctgtaa 60660
gtaacaaaac acttgataat ttcatagaat tttaacctta gaagctgttt atttaaaaat 60720
gaaaagaaag aaaataaaat aggattttca aaagtgtaag ttatgttcaa ttcttttaga 60780
taaaataatt acaaataata tatttgcttc ttgcttttaa actaaaagtt tacaataaat 60840
gtgttatgaa tgcatatatt ttattaaaac attctttata acttttaatt ttattccaat 60900
gctatttgct tatttttatt aaataaagga gtattgaaaa taggtttgta ttcaaatgtt 60960
tttacacagt ccccaagaaa aggtgattca aattatattc atcttcctgg aggcaagatt 61020
taactttatt acattagatt ttgaaatgtg tttcaaaaat tggttaggca aatagtattt 61080
accatacgaa tagcaaatat ttgagactac tttaataagg tagaaaaacc aaggtagaaa 61140
taagttgatt taattaaaaa agattgattc atagagaagt ggttagaggt gaataattag 61200
agtataaaaa ccgggtatat tcttatcctt tgcccacttt ttggtggggt tgtttgtttt 61260
ttcttgtaaa tttgtttgag ttcattgtag attctggata ttagcccttt gtcagataag 61320
taggttgcaa aaattttctc ccattttgta ggttgcctgt tcactctgat ggtagtttcc 61380
tttgctgtgc agaagctctt tagtttaatt agatcccatt tgtcaatttt ggcttttgtt 61440
gccattgctt ttggtgtttt agacatgaag tccttgccca tgcctatatc ctgaatggta 61500
atgccctagg ttttcttcta gggtttttat ggttttaggg ttaagtcttt aatccatctt 61560
gaattaattt ttgtataagg gataaggaag ggatccagtt tcacctttct acatatggct 61620
agccagtttt cccagcacca tttattaaat agggatttat taaatagggg aaaggatttt 61680
ccccattgct tgtttttctc agctttgtca aagatcagat agttgtagat atgtggcgtt 61740
atttctgagg gctctgttct gttccgttga tctatatctc tgttttggta ccagtaccat 61800
gctgttttgg ttactgtagc cttgtagtat agtttgaagt caggtagcgt gatgcctcca 61860
gctttgttct tttggcttag gattgacttg gcaatgcggg cttttttttg gttccatatg 61920
aactttaaag tagttttttc caattctgtg aagaaagtca ttggtagctt gatggggatg 61980
gcattgaatc tataaattac cttgggcagg atggccattt tcacgatatt gattcttcct 62040
acccatgagc atggaatgtt cttccatttg tttgtatcct cttttatttc attgagcagt 62100
ggtttgtagt tctccttgaa gaggtccttc acgtcccttg taagttggat tcctaagtat 62160
tttgttctct ttgaagcaat tgtgaatggg agttcactca tgatttggct ctctgtttgt 62220
ctgttattgg tgtataagaa tgcttgtgat ttttgtacat tgattttgta tcctgagact 62280
ttgctgaatt tgcttatcag cttaaggaga ttttgggctg agacaatggg gttttctaga 62340
tatacaatga tgtcgtctgc aaacagggac aatttgactt cctcttttcc taattgaata 62400
cccttcattt ccacttctca aaagaagaca tttatgcagc caaaagacac atgaaaaaat 62460
gctcaccatc actggccatt ggataaatgc aaatcaaaag cagaatgtga taccatctca 62520
caccagttaa aatggcgatc attaaaaagt caggaaacaa caggtgctgg agaggatgtg 62580
gagaaatagg aacactttta aactgttggt gggacagtaa actagttcaa ccattgtgga 62640
aatcagtgtg gcgattcctc agggatctag aactagaaat accatttgac ccagccttcc 62700
cattactggg tatataccca aaggactata gatcatgctg ctataaagac acatgcacac 62760
gtatgtttat tgcggcacta ttcacaatag caaagacttg gaaccaaccc aaatgtccaa 62820
caatgataga ctggattaag aaaatgtggc acatatacac catggaatac tatgcagcca 62880
taaaaatgtt gagttcatgt cctttgtagg gacatggatg aaattggaaa tcatcattct 62940
cagtaaacta tcacaaggac aaaaaaccaa acaccgcatg ttctcactca tagatgggaa 63000
ttgaacaatg agaacacatg gacacaggaa ggggaacatc acactctggg gactgttgtg 63060
gggtgggggg aggggggagg gatagcatta ggagatatac gtaatgctaa atgatgagtt 63120
aatgggtgca gcacaccagc atggcacatg tatacatatg taactaacct gcacattgtg 63180
cacatgtacc ctaaaactga aaggtataat aataataaaa aaaataaaat aaacaaattt 63240
gttccacata aaaacaaaag aaaacaaaac caaaaaaaaa aaaaaacgta agtggtaact 63300
tatgctattt ctatattgct caatgttgta tttttaactc ttcattcctg tagctgtgta 63360
gtggtggttt actttcctac tgtataatat cagttatatt aaaatgcttt tatttatcct 63420
aaaaaaaaac ctggtatatt ctttatagga aataattata acctttccag agctatatta 63480
attttatttt atattcacta acatacgtat cctcacatgt atttttataa aagtattttt 63540 
ctagatacat acaattatct taaaaatatg taaaatggct tataaaaatg tggttgcctt 63600 
tacccacaaa ttgtctaatg ctattttctt tcttttgaaa tttattgttt tcacagatca 63660 
cattctagaa tttggtagag tgtaatatga cattttattc ataagcataa ccctacttct 63720 
tcacttacta aatttccaca tttttttttg ccttttatta atgatattct tgaattttct 63780 
taatgatttt gtctcattaa agcattcttt tatattaatg atctacctca ctgctttgat 63840 
ttttaaacag taactttgaa gtggataagg cagccaaata gcaacaaata cttgattgat 63900 
agaattcaag gaactcaagg aaaagttcat atgcaataat tgatgtagca gttgtacttc 63960 
ttcattctag aagccattct taaattttgg tgtgatagct ctctggtttt gattgttgtt 64020
aatagtaatt aatcatttat ctttgtgagt ttaattgggt gtcaataaca aacatataat 64080 
aatcatagct aatttttttc aggcactcta aatgggtcca gtagagaatt cagccatctg 64140 
catctaatat ttcacaacaa atatgtgtgg aagatttcat tgcagtccca attttaaaaa 64200 
ttaatatcat gagattcaga gtggctattt agccaatcat gtttatttgt gcccacacta 64260 
ctgtatacta tgtctatcct ctattgactc taaaactttt taaaagtaaa gacagaaact 64320 
atagaaatca aaatggctac aacccgagag agcaagtgaa agaagatgga gaaaaggaaa 64380 
gataaaatag tttatttatt aagtagatca gcagagatga taactatttc attgcatgac 64440 
taatagagat cctagtgcaa aatacatgga tatactattt ttgaaatttc aagttaatta 64500 
gtgaattctg tttatgtaat gtgctaccaa tctttcggtt gaaaacaaaa attttatacc 64560 
attaaaatat tctaaaacat gtatgtgacc atacgtagaa agaacttttt agctacaaac 64620 
atctacatga aaactgtaaa ccagaaaggt aaagttgcag ccgattttga cctgaggaca 64680 
ttgacaaaac gtggtaattt ccttcaagct ttgctttgat ggtattgtaa gatggtagtg 64740 
taatagaaca taatctccac ataaagttgg catacaaaag cctgaccagt tgagatcacg 64800 
tgaagaatta cctcctaaac acacacacac acacacacac atacaaccac cgaaagctaa 64860 
aagggaggtt gcccctgaaa tcagcataca gggtgagatg attcttcttg agagaatcta 64920 
attacaagtc tacctcaaat agatttttag cccaaatttc tttcacttgg ttcgttcaaa 64980 
aagtcctcaa agatattcct gatacaagcc agaaacagaa atatttttta gaaagatgaa 65040
aggatagacc tccgttctag acagctacaa agtagacaaa aacattttcc ttaacaaagg 65100 
actagtatct agaatacata aaacattaat ataaaaagag aggaaaaaat aagagaaagt 65160 
aaaaaataaa taatgggcaa ttgaacaagc agttcctcaa agaagcaatc caaaattcta 65220 
ataaacatac tttttaaagg ctaggtttta ttattaatca ggagaatgtg aaataaaata 65280 
cacttcatgg acgagtaaaa atgaagaaga ctgcatttat caagattaat gagaatatgc 65340 
attcatgtta actttcacac tctggctgat aagagaatat gctagtggct aggcgtggtg 65400 
gctcatgcct gtaatctcag cactttggga ggccgagggg agcagatcac ctgaggtcag 65460
gggttcgaga atagcctggc caacatggca aaaccctgtc tctaataaag ataaaaataa 65520 
aaagaaaatt taaaaaaaag ggtagccatt cgtgttggca agcacctgta taggtctcag 65580 
ctacttggga ggctgaggca tgagaattgc ttgaacctgg gcggcagaca ttagcagtga 65640 
gccaagatcg tgccactgca ctccagcctg gacaacagag tgagactccg tctaaaaaaa 65700 
aaaaaggaga atatgttagt atactctcta tggaaaaaag attggcacta cctcataaag 65760 
ttgatattat gtgtaaccca tgacccagct actccatact ctatcctaaa gtaatgcttg 65820 
catattgcat gccacgtttc gggaggcatg tataataatg ttactagagc aatgttggtt 65880 



atagccaaca gaaaatacca caaatgtcta aaacagtata atagataaat ttcggtgtgt 65940
ttttacaatg gaagaaattt agctacgtac aactttgatg actatcacaa tatattaaaa 66000
aatacattct tagacaaggt agcatgagga aaaagaagaa aaaaataata tgccaaagaa 66060
tgccttaagt atgggaatga tcaaataaaa ttcaaaggaa cccaaactat gctatacttt 66120
tcagaaatgc atatgaaaaa tctatattaa aaagaaaaga tcatcacaga aattaggata 66180
gttattacct ctagattggg gagggagtac tatattgggg tggacatatt gggttttctg 66240 
tgtgttagca agtttctatt tcttaagtag attacatggg ttatcattat ttgatccatg 66300 
tatgattagt cattaaactc tgcttattaa tatttattgt attcaaatat tctatgtttg 66360 
gcatatcttg taataaaaaa agaggagaaa gttccaagaa tctacctttg gataatgttg 66420 
gcattgcaat ggtatcaaat ttggatcttg ctaatctatg tattttttct ctatagatgg 66480
ataccctaaa aatgaaattg agtataagtg gaaaaagccc tccgtagaag tggctgatcc 66540
taaatactgg agattatatc agtttgcatt tgtagggtta cggaactcaa ctgaaatcac 66600
tcacacgatc tctggtaaaa agaatactca aatagtgaaa tgatgctagc attaaatgac 66660
atttttattt taaatatatg gattgtgatt tatattatat agccaaaata aaactttttt 66720
ttcttacagg ggattatgtt atcatgacaa ttttttttga cctgagcaga agaatgggat 66780
atttcactat tcagacctac attccatgca ttctgacagt tgttctttac ttgggtgtct 66840
tttggatcaa taaagatgca gtgcctgcaa gaacatcgtt gggtatgaca tgtaatatta 66900
tgctataatg ccatagaact ttaaaaaatc attttgatgt gataaaagtt catagtcaca 66960
tttattagat atattcttct agactacagt gcagggaaaa aaattgacag gaactaaggg 67020
acaataccat gttgaaagag atatgtaaaa aaagaaaagt agaaaatgga attgtaaaga 67080
acttgaaatt tcagaaaatt ccttcttttt tctgctaatg aagaccaagg gaagcataga 67140
tccataacaa gaattaatac atgaaaatca tacgtaattc ctaaatctta ttttggttga 67200
gtttttctat cacgtttcat caagatttta ttttatgtta tagctgagtc actaataacc 67260
actgtttatt tactgaggtg tgcagatggc tttaaaataa cctgctttta taacattgct 67320
aatgaagtca atctttccaa agggcaagta gctgaaagga acaagacgct ccctggcttc 67380
tgtggcttct gtcatgtata aaatgtgtat ttattctgtg aatacttgag agttgcctgt 67440
agttgcctgt agaataacat tgattaacag ggtaatttga ttcatacttg acaaaataga 67500
cttttactat ttttctgtaa caatctattg taataatcaa atcacataac aaaataagtg 67560
actaattaca gtatagttat ttaaaagaaa atgataacct gcaatcaggt attgatatag 67620
gaataaaact tgttaatcaa ttgattcacc ccccaaatca gtgaaagaca catgaatcac 67680
actgctcata attgtgtctt aaaaatatgt aaagaaaggt gagacagttt ctatttttaa 67740
ccacaaagaa gtttatatgt gatttctatg atagcttata aatgattcca tctaataagt 67800
acagtaaaat gcaggtcaca taaaagtgac aacaacaatg catgaggcac atgggtctat 67860
acttgaaatt ctcttggcca tctaaacagg gatattgaca tagctagtgg acaaaatcca 67920
ttcgtatctc aattttgttt cctttcattg tttaaaatta ttttataagt tgtataaatt 67980
atttgaggtt ctttgtatgc tactataact taaccatagt atttttgttt ctatttccat 68040
atttttctgg gaatgtaaat tagtataacc actatgcata actgtttgga agtttctaaa 68100
agaatttgaa tggattattt ataccactgg gtttgtataa agttctatgc tagagtgtca 68160 
tgaagagaaa agccaactgt gaagtcgttt ttgcataaac ttatttttat tcaatattag 68220 
catatcaatc ataaaggtaa gcacagccag atgtaacttg acaaatatac tcttgtaatt 68280 
gtggacccta agatagagaa ttacaattgc tttaaatata catgcacaca catatcactg 68340 
caagacaatt tcctgaactc aaccttgcat ttttctacaa tgtatccaat gctttctgtt 68400 
aaacaagcag tgtttttggt gctttagtgt taacagtgaa taaaatttta aaacttcgtg 68460 
tttatggtgt ttacatttta ttagtagtaa aacaggtaat gtacaaatgt aaaaggcata 68520 
caataaaatt ttagtgagtc atgaaggctg tgaagagata gcaaaatcaa tgtttagaaa 68580 
atgacagggt gtgagtggta tgatatttgc tattttatat tgtggagtct caatagatac 68640 
ctctgatgat gtggcatttg agcagtgacc caatgaagtg tggtgctaac ctggtggata 68700 
ttgggcagaa agatattcca ggcaggggga accatatggt ttttgaagcc agatcttcct 68760 
gctatgccat gctttggggt tcagtaaaaa gacaaatgtg gttggagctg aatgaatagt 68820 
cacagcaaaa taaatttgga gaaggggcca atagttagat caggtaaatt tttgtgggcc 68880 
attttaagga ctttaaattt tatttcaaat gtgatgggat actattgtag gatcttgagt 68940 
aggggaagcc atggagaata tagcaatgat atccaaaatg ttttctcaat atcctgttaa 69000 
tatcttaagt gttttttatt atcacggact cagaaatatt ttaatagaaa atccactatg 69060 
ttcttctcaa ggtaatgaca gtgataataa ttactaaatt aaataaatgt ctgaaagcaa 69120 
agttctgctt agtttttttg ccttattaaa aaagaaagta acagaaactt aataaagtaa 69180 
aatttattgc agtagtggat aacaaaatga aaccacacat tggatttaag atttttcaga 69240 
taggagacat aattgtggag ccaagaaaca ctttggaaac ctacctgggc tgaagatggg 69300 
aacaatgtga tttaataagc ttgtgtggga gatgaggcaa agggagactt gtatatatca 69360 
tttttttagc ctttaccata acaaagcctt taaaaatgta catttagttt caaaataatt 69420 
gaagtaaaag ctgtaaggaa tagtgggaga aataaacaaa tctgcaacca tactgttttc 69480 
tcattcattg ttagaactac aagaaaaagt taattaggat atagaaactt tgaaaagaaa 69540 
ttatcatcaa ctactttacc tgattgatat ttataggaca ctacacctaa cttcagagca 69600 
cacaatattt tcaagtgcat atagatcatt aaacaagata ttactaaaat acaaaacact 69660 
aaaattttca gagtatattt tctaacaaaa aaggaagtaa attaaaaata aataaaaata 69720 
atataactga caatggcctt gtatatttgg aaatttaaaa aaaagacaaa taattaaatt 69780 
aagtacattt tgtgtatatt tactacaatt acgaagcaag atattactgt gtttgttttt 69840
tttaaacgag aggcacaaag gagttgaaaa actttgttaa gatcacagag ctaatagaag 69900
agtcagagat taaacctaaa tttgtatcat aattccaaat tcatcacaac tgtttcacct 69960
ttaatatttt ctccaatcct aagaaatgat aaaagtgaaa tgacaggttt gtgagtagaa 70020
tttttagggt atgtttcaat ggagacatga ggggaagagg tggttattgc cttcactata 70080
ccaaagagac tgttgctaca gctgtttttg atttactttt catcataaat gttgtcactc 70140
tataaatatt ggtcaaatta agaagcattt agcattatgt ttagtacttt taagagagta 70200
aacaattccg ctcatggggc catatttact gggaaaacga atgaggatta tgaaagttta 70260
ggttaaaaga aatattagca tgtgcatctg taatttagga gacctaagat taatagtgat 70320
aagaaccaca caaagactga tgagaaacag atcccttcaa ggatacacat acaggattgg 70380
gacccttggg gacatcccag gtatacatgc tgacttctca aaaataaaag gatttactct 70440
gtgctttcaa agcagtattc ccttgatata actgtgccag tattcctcac ttaaacattg 70500
agattaaaaa atgggtttaa ctcaggttat tatataacaa aaagctatgt tagtttggat 70560
taattttgac tgtatatgta tggtggaaaa tgtcaagtca ctgtcacttc aattagatag 70620
aagaatatct gttgcttatg tgaagtctag gggtaggtat ttcttttttt tttaatttca 70680
aggcgataca atgcattata gtatgagaga ctcaggctcc ttctggcttt ctcctgtatt 70740
gtgagtaaat ttaatttcca atattatcta atgttttaag atagtgctat tggtctggcc 70800
attgtattcc agcagaaggc aatgagaagg aaagaagggc attccttgtc tctttaatgc 70860
tgtctttagg atattgtgtg ttgcctacat tcttgacttg gttccctttg gctagatctc 70920
agcctgtagc cacactcaac tgcagagaga tggcaaataa agttgttctg ggcagtcata 70980
tgccagctaa atatttgata ttctattaca atggaggaag ggaagaatgg atatcaggag 71040
tcaacttgca ctccgcagta atttaaatga ttagaactca cagaaacaca aagagcacat 71100
tggacacaat gttgctatgg ctaccaataa ttattaatat aaattactag taatcatatt 71160
tttgtttcta ttgtttaatt tcactcacaa tattgccaaa tcagatggaa gaataaatgt 71220
ggtctattaa ttttcatttc ctgttttgtt tcctctctct ataggtatca ctacagttct 71280
gactatgaca accctgagta caattgccag gaagtcttta cctaaggttt cttatgtgac 71340
tgcgatggat ctctttgttt ctgtttgttt catttttgtt tttgcagcct tgatggaata 71400
tggaaccttg cattatttta ccagcaacca aaaaggaaag actgctacta aagacagaaa 71460
gctaaaaaat aaagcctcgg tatgttaaaa agaaaaattt ctattttata aacctcaact 71520
tacttagaga tactcttttg aatatttatg tgtttgttta cagaaatagg gacgcttgag 71580
acccataacc aaagacaaag ttaagatgaa actggagtac ctaacattgc tcaccaagaa 71640
aagtgtattt ctgtttttca aatgtctccc attggatccc ctgtgtaaag agtggtttat 71700
tatggcctga attttcccac agtggtagat tctttgtacc tcaaacatta tttgtgattt 71760
acctttgatc cgattttttg tgttaggatt tgcatttcta acaaaattac cttactttta 71820
gtgatgaaga ttcctcttta atggtagtgt ccaactgtgt ttaaaagtga tgaccacaac 71880
gttctttttt tgtttttctg tatttgtact attttactct cccatctcaa tcatataatt 71940
accttaactt atatcacatc aagtctaata tattaaaatg gattctcctg aatctaatat 72000
actcatgtat ggggttattg actatgatta tgttttatat tatcaattat agcaacctga 72060
aagtaatatc taacacatgt tgaatgttcc agacactctg cagagcattt tacatgcatg 72120
atttcatgta tgcttagagc aactctctaa gtagaggttc aatgatttac acaagaacag 72180
cctcagtaag cgaaagatcc aagactcaga cctggatcct ttttggtgcc agagctcaag 72240
cgtgtgaccc ttcctttata tatcctgttc tcacagggct taaaaaaata gacatttatt 72300
tttctaaaat taacaagaaa acatttaatt tagtaaatgc tttcaaagat ttgtttcaaa 72360
atactcaaat aaatgagaaa tgtatcaaat agccaggcat tagtggcaag cttgtaagcc 72420
taccttaatg tatatataat gagttttcca tattatctaa ctttttaagt tctaggagtc 72480
aatgcatttt tccctaatct tttttcattg gtttctttat aaaactctgc tatactttta 72540
agttatgttt tacttttcat gttatattcc aagagaaaaa cctggtgtat tttgctttca 72600
gatacccaca atattccaag aaagcagagc acattgctag cagtacaaat ttgataactt 72660
gcattgtaga attagccagt gactccgtca acttcacgtc atttttgctt atctatagga 72720
tatattttgt ctcctgaatg gacaaatcag aagtcaacta taatatccat tgttacattt 72780
ttagataaag aaaaatgaat gctttctaca attttccttt catcaaagcc atgcatcttc 72840
tctggatttt ccattttatc gtttacttat ctgtatctat gttggattat tgccttgaaa 72900
gttttatttc tggaaatttt gagtggagat gtttagaata aaaataacta atatagaggc 72960
caggcaatgt acaatctcac atccattctc taatttcatt ctaagaataa ataatttagg 73020
taccactatt atattatcga aggacacaac caaaacatgg ctactctagg acggtagcta 73080
atcagtttga ccgcagatca tgtacactca gccaaagttc caatggcaca cttactcaag 73140
agagctggat tactctagtg tagactttaa acttatcatg ttattcaatc attcatcaac 73200
tatttattaa ggacttccat aatctaggca ttgttctaga atagttacta tttttaatcc 73260
atatgttaag tctttcaaat ttagcatcaa gcctaggttg caaactacta attatggaaa 73320
ggtaaagaac gaaaattaaa tgatagtatt gaagtactaa atggcaaata gcaacagtta 73380
aattgacata ttataattga ctacttgagt accaaaaata ttttctttac tgccattcat 73440
aatgaagttt gatagtctct tttctttgtt caattatgtg ctttatgagc tctttcagtc 73500
ctaaatgata ttttgagtga aaaaccggga tcagtaatca aatctgactc cttgcatggt 73560
tcagtgtaca gccatatggt cttgctaatc aggtaaatca gaatgtactt ttatgaggaa 73620
taggcaactc tgtttgttgt tgtctcaaat cctctattta agggctttct ggaatccaca 73680
tttttctttg gttgcaagat gttgtagcat cttgccatta agcctgtatt aaatgtttgt 73740
gacttcaccc cctccttccg cctttctctc tcctttcctc tttgtctttc cttcaatctc 73800
cccatttctt tattaaaatt cagagtagtt taaaatccag cattttccct ttaaaaaaaa 73860
tgtttagggc tatagtcagg ctaggatgca atactttggc tagagagttg ctgaaaggtc 73920
aacaattctg ttattaagca aaattaaaat ccttttcctg acatcttgtt gactgcttaa 73980
ctgagaatct caaaatgttc agtttttttt ggaagagata atacttgaac aaaatagatt 74040
tttagcttca aagaaccacc taaaaatata aattgataaa cacccgtttt aacttactct 74100
ccacatgaaa tgggccagct tgcttgctct ctctcactct ttctttcatt ctctttcttt 74160
catttgtctt ttatgctcgc tttctatttt tccttccttg cttttccttt ctcccttccc 74220
tttccttcct tccttccttc tttctttcct tcctttcctc cctttccttc ccttcgcttc 74280
ccttcccttc tccttccttc cttccttcct tctttccagt aagaaggtta cttaaacaat 74340
tgtaaagagt cggtgtatgg tgggatattt tcaatagata gagtaaccat ttagtaccag 74400
ttgtgacaat ggccatatga tgcttcccct gaagtatata gataccacca ccaacagctc 74460
ttttttctac cttggaagaa ccttccttcc ttccttcctt ccttccttcc ttccttcctt 74520
ccttccttcc ttccttcctt ccttccttct ctattccatt atgaccaaat ttttcttcca 74580
ttttattgta catgtgaatg acattttgga ttgtttcata tgcccatcca atatatatta 74640
ttactaatcg gcaagtatat aattcaggtg tatttgtgcc aacatttatt ttcaagaaag 74700
tctaagaagt taagttaaat atgttgaact atggaatcaa gacattttat gcaacagata 74760
aagccactag taataaaccg aaaggcacta gtaataaact agcctttgcc aattgaagag 74820
tagtttatgt tgaaacttga atattcggcc cataggctag cagaaatttc agaaaaccat 74880
gttgtaccta cttcaggggc ctgctatagc catttgagga ttgaacttac ttatttttag 74940
tttcctgatt ttattaagat gtgaaattat agaaatttat ttatccagct atggacgaga 75000
cattatggag acaaaaaaga gatagaatgt gttctttgct ttcagggatg attacatcca 75060
gttgtgaaaa aatgcatcgc aaatacaaac ataataagat acaatataaa taatattcca 75120
agataaagat caaaattaaa caaggagtct gtgagggaat taacaattag atgaataatt 75180
tctcatgctt tttcagaagt agggtatttt tataacaaat atacataaga tattttatta 75240
tttttaaata acaatatttt tattttttat aaaaacataa aatgtatttt taaaggaact 75300
gttcttgtgg aaacagacag gtatcacaaa gccagaccta tgtggcctat gcctagcatc 75360
ccaggcagtc ttcagacatc atattatgtg gcctgtgaaa cattgaacag gcacacagtg 75420
ttgaagagaa ccttgtggag aggagaggat tcctttcatt tcatgtgtaa ataattcatt 75480
cataaaacat ataaactgat ggaagtttgg caaggaaaat tccagagaaa gtgtggcttc 75540
actgaagtaa ttttttgaag aaaaagttgc agttttcaag ataacaaatt agaaagaaca 75600
tttaagatag agggaactat gtcagctggg ctgagcctta catgtgagaa caaaaagtag 75660
actagcatac gtcccaagct gtgctgtcca atataatagc cactaagcac atatagatag 75720
gttgggttaa attaattaaa attaaacaaa atataatatt cggtatctta gtagcgctag 75780
tcacatttca agtgctcaat agccacatgt gacttatggc tatcatattg aagagtgtag 75840
atgtagaaca ttgtcatcac agaaagtgct attgggtaat gtttatcttg aaagtaggaa 75900
agaaacattt tctcttagga aagaaaataa cctatgttga ggatagaaga gtggtttatg 75960
gccaggttgt gcaaaatctt taaagtcatg ctaagaattt acattttgtc ttgttagcgc 76020
aggcaagtta tacacgtttt taagcaggat ggtatatttc tatgtatgtt ttcagaagat 76080
aattagaaat agtatccagt atatacttgg atggcaaaag aggcagggag ataaatttgg 76140
agactattgc agttgttcca gaagtaatat aatactaaaa gctgaaacca agatattagg 76200
atgaaaataa tgcattagca agtttttggg tcccccctgg caccgcctca ccccacagtc 76260
ctcacccaca gctccttttc agcaaagaga ttttcatttc agtcagtgca tgcctattga 76320
ataaacattt gtctacctct cagagtacta aaataatgaa aagctgtatc aatccacagt 76380
ttcacaactt atagattctt tcttataata gtatttggca aagctcccag gaaggattat 76440
gtgatattgg tgaaatgtga aatacagcaa gaaggttact taaacaattg taaagagtca 76500
gtgcatggtg gaatgttttc aatagataga gtaaccagtt agtaccagtt gtgacaatgg 76560
ccatatgatg cttccctgga gtgtatagat accaccacca acagctcttt tttctacctt 76620
ggctaccaaa gaacactgct taatatttat atttctattt aaaacatcat agactatcac 76680
agcagaaagg gaccagatta atcattgagt aaaattccca tttaaaaaaa ttaattttaa 76740
cagtaatagt ggagcatggt tgaaactaat aaacaaatag aacaattcat aaataaacaa 76800
atagaacaat tcataaatgt acaaataaag aaaattctct acctagaaca ttttgtccaa 76860 
aatctcagaa tcaactacat tgtaaccatt tctgatttta cattcttgta attattattt 76920 
tcattgtatt tctgattctt actgtaccaa atttccaggt atctataact ctttggtaca 76980 
aaagaaaaaa tagctcttct gttgtatctt tcttattctc tttctcagta ttttaactta 77040 
ctattaatag ttattgacct ggtgtattgg ctcactcctg taattccagc actttgggag 77100 
gccaaggtgg gcagatcacg gggtcaggag attgagtcca tcctggccaa catggtgaaa 77160 
cctgtctcta caaaaataca aaaaaattag tcaggcactt ggttaatttt aaaactttaa 77220 
ataatatatt taaatctcta cttctgaatt cattattgtg attaattcta gtgccaattt 77280 
tattatcatt ctttgcagat aaaacttttt tcttttctta agaaactttc catcatcttg 77340 
tttttttaca gtgaaaaaaa aatcttaagg agtcgtacat tttctgtgtt ttgtcctctt 77400 
tcatttcctc attacttaat attcttatgt tttatttttg agaatgttat caattgttca 77460 
ttccaattaa tttctatttt ctccttggaa actactggta gttagatttt tgacttctta 77520 
ggttcagcat gtattccaga ttttcctttt atcatatttt aagccatttt tccatactat 77580 
gagcagtttt ttcaagattt cattttaaca cttttagctt attaaaaatt tacaataact 77640 
ttatgaattt cagagaggtc ttattttcta actgttcttt gttcattttt ccttaagtgg 77700 
atgtattact tcctcaagta tttctgagtg gactaactag atttattttc tcatatttcc 77760 
taaaatatta gaatcagtaa ttccagctgt tcattttcac ttttctcctt tgtattgttc 77820 
acttacctca aaacttcttc atgattgttc cccaaatagg cccggtaagt ttcgtatgta 77880 
actgtcaagt ttgcaaccat cgtattttac tttcatacgt gtgtttgtgt ggcatgacat 77940 
ctgattcctt ccaaatatac acatatacat tctaaatgat aggatgatgg cacaatacta 78000 
ggacattcca gttccaacat caagagccct aacctttctc ctaggatagt gaaagtttct 78060 
tctcattttt tagaagttat agagtttaga gatgagtgaa tgccaaccgt cgtcatcttt 78120 
actctctaca tgcacagtag gtgaaccata gttcaaagga gtgcagctga tctggaggtg 78180 
gacatttagc taacaccatg ttttccattc tattcattac tcccattctt ccttccttat 78240 
actgagtgga agagtagatt ccacagtcat gtttccattt actgagcctt tctgagatgt 78300 
gtcttctttt ggtctgcttt atccattggc ctatttccat ctgttttttg tacttcagaa 78360 
attgatcaga aattctaaat gctgattggc agctcctcgt caccaagatt cacctcctgg 78420 
aatttgtttt ctgtttttta agacaaagtt tcactccatc acccaggctg gaatgcagta 78480 
gtgcaatcac agctcactgc agcctcaacc tcctgagatc aagccatcct cccacctcag 78540 
cctcccaaat agctgggatt gcaaacacat gcaactacac ccagataaat ttcttttgtt 78600 
tcctcttttt gtggatacag ggttttgcca tgttgcccag gctggtctca aactcctggg 78660 
ctcaagaaat ctgcccacct tggcctccca aagtgctggg attacaggtg tgagccaacg 78720 
tgcccagcct tctcctgttg attttttaat gttgatggta taatctacaa tttttgaaca 78780 
tcccaactta aacaggattt tgtgaaggag tggtaaacag tgcttagttt ggcattttct 78840 
catgcaatat tttaaatggg aaattaatat tatatagtca gtatttcata ataatataat 78900 
catcaaataa tatttaccat tcatattact aatatttaat aattttagca gatggtataa 78960 
taattatcag atggtataag taataactat atgtaaatac tgatttattt atcacttccc 79020 
atttatttga tattagtata ttttggtgct taaaatatat tttattagta tataactaga 79080 
cagagatata tccttaatag tgaaaatttg atgaatttct caaaatagtg agctcagcta 79140 
tattcttagt aatttttgaa attttttaaa agtgtatttt aaaaacgatt ttatttaagg 79200 
tagaatattc tattcaatta tgttgctact ttttaaattg ctttaactga tttgtttttt 79260 
aatgtttgct tttttgggga aggcagggtc tattcttgag aaactaaaag atgggtagca 79320 
tagctgcact tcagataaat aagacttgga atgtatacat ttttctgatt aatttctcag 79380 
tggtttaaag gtcagaaatc tgtgtttcat caatatttta ttttattgac attgtgttag 79440 
atttttggct atttttctca gcatttatat gtccatttct gtttactttc attgttcact 79500 
gatgtgtagt ttctttcaat tcatgaactt gagttttgag gggatggggt ggagggagca 79560 
aaaatctaga caactgccaa tttaaaatgt atttcccata gaactctcca tgatactctc 79620 
accagagagt tcatttttta aataaacccc ttgcagtttt tgttcatttt agtgcactaa 79680
aaaatttaat catttctctc atgaaaaact aatatcctct gaaaaacaca ccatatttcc 79740
atttatgata atgccgtttc atatatggaa tgtgttttct accacataat cttaaaattg 79800 
cctcctaaca attttttaaa attaattttc attctgatta gcataagagc atatgctatt 79860 
caagtatcaa gtgacaagga aataggtacc aattcgtctc tctctttttt tttacatgat 79920 
tttagtgctc cattaaaatt taggaatgta atcaactatc cccagttcca tcctcaaaag 79980 
atctgtcatt aacatttaat aagatactgc ctctgctttt tgtgatatag ctaagcaagt 80040 
ttaatcctga acttacttgt gtttggggca tttctcctta tcgtcaaggt gcacatttgt 80100
ctctaatatg aatattgtga cctgcatgat gctttaatct tttaaaaata tcactaaatg 80160
tgaatatttg cttccttttg catctactgg atttttaata tttggtgact tggaatctat 80220
gtttcttttt agaactaaaa tggacatgaa tatagttaaa gttgtaagaa attttaaact 80280
tttgtataca tttgtaagaa tgtttttcat aaaatattta attttgttga cctatttcaa 80340
aatttagcat gaaaacattt aacaaacatt tgcatgggaa aagatgtaag agctggttaa 80400
aacaattttt ataaaattgg cgctatattt aatgtttaga gagtgaaatt gctgaataca 80460
ataagttgta agattcacat cttgagataa ccctcaaaat atgttgaatt ctgaaaaacg 80520
aagaattgct aaagttgaaa tactcatctt ttacttgcca gcttattaaa ggagtcatta 80580
acttttttcc agtaacattt ttatgtggtc taggactttg ttgttgttgc tattattatt 80640
gtttgtatgt ttatcttagc atgaatttta taatcaattg tgatataggc ctcatcagga 80700
ataaacagta acaatgacaa caataatgat gttacctact ggcaggtgat tacttgctat 80760
gtttcaagta ttttgcttag tgttttatat ccatcatccc atttaataat caagcaaacc 80820
tacaatgttt ttattgccat ttttgtacag aagaaaacat agaagccaaa gaataatact 80880
ttcacttttt aattatttaa gtattattca aaatcaagta ttataaaaac ttaatttaac 80940
atgttccaaa taaagctgtt aataatagta gctattatta ctgagcatga tggagcattg 81000
atcattttac ataagccatt gaaatttact tatcatagct atcacagaaa ttaggtgcaa 81060
gtataattcc agctaagaag acagatacat tggtagaatg agcagaattt gaacaccaat 81120
tgttcccaag tctgagcttt taattcctac acatcctctt tcccaaatat ggtgccttgt 81180
gcttacccat cacaaacccc ataacttttt ttgtcaattc atatcttcac aaaccttttg 81240
cattatcaaa atggcttctt tgctatttat tctcattttt attaaagtcg ataaataata 81300
tataatagca aaaaaagatg tgtatagcac tgttaatgtt ttaatataaa atagaatata 81360
cagttcattt tattttagaa ttagatccat ggtaaatgat attttggtga gtcaactttc 81420
taaatgtata aaaatatatt ttatttttca ggggtatttc atttctgctt aatagaatgt 81480
aacaaatgtt ctattacaaa gcaaattata atataaaaca tgttataatt gaaaatactt 81540
gattttttga aatcaagatt attttcatta cctgtcagtc tcctagagtt tgcgttaaag 81600
gagcaaattg atctttcctt atgctacttt tttgatgtca aaaattcatt tattattgtg 81660
ctcagatgac tcctggtctc catcctggat ccactctgat tccaatgaat aatatttctg 81720
tgccgcaaga agatgattat gggtatcagt gtttggaggg caaagattgt gccagcttct 81780
tctgttgctt tgaagactgc agaacaggat cttggaggga aggaaggata cacatacgca 81840
ttgccaaaat tgactcttat tctagaatat ttttcccaac cgcttttgcc ctgttcaact 81900
tggtttattg ggttggctat ctttacttat aaaatctact tcataagcaa aaatcaaaag 81960
aagtctgact aaattcagta gaatcttttg tacttcagta acttgaagtt taaatttaaa 82020
atgcagagag accaatggtt aaaatgtgaa tagtattgta actattttaa ggccttcaga 82080
agtaaataaa gtagcagctt tcaggctaat ttacgtgaaa ctgattagtt gcaaaatcca 82140
gtaggttaaa atactcacat atttttactt aaattttctt taatttactt atatgttatt 82200
ataattttga atttttaagt tctatgattc atgttttaaa gatggaataa ttttaataca 82260
tattttgttt aaatataatc tataattgtt ttgtaatgta agactaatta ctaatattta 82320
tgtagcaact tttgtgccag aaaaaagact gttaatttgt tttttcttgc ttttcatttg 82380
attaccctgc ttgaaataca gttagttgat aacataaagc catagttttc ttggattttc 82440
ttccaagata ttgtattccc aagaagaaat tgatttattt ttaaactatc agttactgaa 82500
gacttatgaa aaggtcaatt tttacctgtc ttttaatcca gtccattttc tgacacaata 82560
ttaaacagaa cgccagttgc atttactttg gtgatttgca aacttggaat gaagccacca 82620
gtcatttttt aagaatgcaa gatgaaaaaa tcaggtaaaa tctaaccatt ttattctctg 82680
cttcatagca tttataatgt atgatgaggt taacactgaa atattaaaat ctgcaaatgc 82740
acattaatgc aaattaaatg cagagagaaa tgattatttt tttctttatg ttttatgaaa 82800
gtgttgcagt ccctaaaata aaaatattga ggaacattgc ttgtatttac tgatgtggtt 82860
aaaaaattat tacctggcaa actgagttgc aggtcacatt gaggacaaaa tcttaaatat 82920
gtgctattag cattgttt 82938



<210> 4 
<211> 465 
<212> PRT 

<213> Rattus norvegicus 



<400> 4 

Met Gly Ser Gly Lys Val Phe Leu Phe Ser Pro Ser Leu Leu Trp Ser
1               5                   10                  15

Gln Thr Arg Gly Val Arg Leu Ile Phe Leu Leu Leu Thr Leu His Leu
            20                  25                  30

Gly Asn Cys Ile Asp Lys Ala Asp Asp Glu Asp Asp Glu Asp Leu Thr
        35                  40                  45

Met Asn Lys Thr Trp Val Leu Ala Pro Lys Ile His Glu Gly Asp Ile
    50                  55                  60

Thr Gln Ile Leu Asn Ser Leu Leu Gln Gly Tyr Asp Asn Lys Leu Arg
65                  70                  75                  80

Pro Asp Ile Gly Val Arg Pro Thr Val Ile Glu Thr Asp Val Tyr Val
                85                  90                  95

Asn Ser Ile Gly Pro Val Asp Pro Ile Asn Met Glu Tyr Thr Ile Asp
            100                 105                 110

Ile Ile Phe Ala Gln Thr Trp Phe Asp Ser Arg Leu Lys Phe Asn Ser
        115                 120                 125

Thr Met Lys Val Leu Met Leu Asn Ser Asn Met Val Gly Lys Ile Trp
    130                 135                 140

Ile Pro Asp Thr Phe Phe Arg Asn Ser Arg Lys Ser Asp Ala His Trp
145                 150                 155                 160

Ile Thr Thr Pro Asn Arg Leu Leu Arg Ile Trp Ser Asp Gly Arg Val
                165                 170                 175

Leu Tyr Thr Leu Arg Leu Thr Ile Asn Ala Glu Cys Tyr Leu Gln Leu
            180                 185                 190

His Asn Phe Pro Met Asp Glu His Ser Cys Pro Leu Glu Phe Ser Ser
        195                 200                 205

Tyr Gly Tyr Pro Lys Asn Glu Ile Glu Tyr Lys Trp Lys Lys Pro Ser
    210                 215                 220

Val Glu Val Ala Asp Pro Lys Tyr Trp Arg Leu Tyr Gln Phe Ala Phe
225                 230                 235                 240

Val Gly Leu Arg Asn Ser Thr Glu Ile Ser His Thr Ile Ser Gly Asp
                245                 250                 255

Tyr Ile Ile Met Thr Ile Phe Phe Asp Leu Ser Arg Arg Met Gly Tyr
            260                 265                 270

Phe Thr Ile Gln Thr Tyr Ile Pro Cys Ile Leu Thr Val Val Leu Ser
        275                 280                 285

Trp Val Ser Phe Trp Ile Asn Lys Asp Ala Val Pro Ala Arg Thr Ser
    290                 295                 300

Leu Gly Ile Thr Thr Val Leu Thr Met Thr Thr Leu Ser Thr Ile Ala
305                 310                 315                 320

Arg Lys Ser Leu Pro Lys Val Ser Tyr Val Thr Ala Met Asp Leu Phe
                325                 330                 335

Val Ser Val Cys Phe Ile Phe Val Phe Ala Ala Leu Met Glu Tyr Gly
            340                 345                 350

Thr Leu His Tyr Phe Thr Ser Asn Asn Lys Gly Lys Thr Thr Arg Asp
        355                 360                 365

Arg Lys Leu Lys Ser Lys Thr Ser Val Ser Pro Gly Leu His Ala Gly
    370                 375                 380

Ser Thr Leu Ile Pro Met Asn Asn Ile Ser Met Pro Gln Gly Glu Asp
385                 390                 395                 400

Asp Tyr Gly Tyr Gln Cys Leu Glu Gly Lys Asp Cys Ala Thr Phe Phe
                405                 410                 415

Cys Cys Phe Glu Asp Cys Arg Thr Gly Ser Trp Arg Glu Gly Arg Ile
            420                 425                 430

His Ile Arg Ile Ala Lys Ile Asp Ser Tyr Ser Arg Ile Phe Phe Pro
        435                 440                 445

Thr Ala Phe Ala Leu Phe Asn Leu Val Tyr Trp Val Gly Tyr Leu Tyr
    450                 455                 460

Leu
465